ON THE MATERIAL DURING THE PROCESS OF LEARNING. AN
Introduction



Attention to the behavior of individual subjects (Ss) certainly
is not a new idea, but despite arguments for it, group experiments have
predominated in the verbal learning area. Further, about the only
experiments that have been concerned with structure and patterning in
learning have been those concerned with clustering in free recall. But
these experiments have also been primarily concerned with group performance.
Typically, they have manipulated aspects of the stimuli in terms of
norms generated by prior groups of Ss and then scored the behavior of
the experimental Ss in terms of whether it did or did not conform to
what the experimenter (E) had "built in" via the norms. This is a concept
attainment view as differentiated from a concept formation one (Bruner
et al., 1957, pp. 21-22 and 44). Clustering on the part of Ss which
did not conform to that which E had built in was usually ignored or at
best mentioned in passing. Until very recently the only studies directly
concerned with subjective organization in learning have been those of
Tulving (1962), and of Carterette and Coleman (1963). These studies,
however, have been limited to examination of the degree to which Ss
report words in the same order from trial to trial (by means of a measure
called SO), and have been completely insensitive to clustering unless
it was accompanied by relatively rigid sequencing.

Another approach to the analysis of subjective organization is
exemplified in a study recently reported by Marshall (1967). This
study classified clustering into two kinds: a) experimenter-defined,
and b) idiosyncratic, but did so by means of a post-experiment recognition
association test. Mandler (1967) reports a series of studies in which Ss
subjectively organized the material prior to being tested for learning.
The subjective clustering of individual Ss during learning has not been
studied, except in terms of the SO (or a closely related) measure.

The present report describes an experimental paradigm designed to
measure subjective clustering of individual Ss during learning, two
experiments employing that paradigm, and a third experiment concerned
with a particular implication of subjective organization behavior.



The Experimental Paradigm

The basic procedural elements of the paradigm are the following:

1. A presentation period during which items to be memorized are
presented one at a time (five sec. per word in the present studies), in a
random (or pseudo-random) order. During this period the Ss are required
to write each item down as they see (or hear) it on a specially prepared
study sheet. This study sheet contains nothing but an array of blank
cells (e.g., an 8 1/2" x 14" sheet of paper with a matrix of 12
columns and 28 rows). Each word may be written in any cell in the
array, but only one word per cell.

2. A study period, after completion of the presentation (one to one and
a half minutes in the present experiments), during which the S is allowed
to study his personally created study sheet.

3. A test period (a single block of time equivalent to four to four and
a half sec. per word in the present experiments) during which the S is
instructed to write the words in a list in the order in which they occur
to him.

4. Repeats of the preceding three steps (in the present experiments
either four or five repeats). Each repeat involves a new random order of
presentation and new study and test sheets, old sheets having been removed
at the end of the appropriate periods.

Current experiments also incorporate a one and one half minute "pre-look"
at a randomly arranged simultaneous presentation of all of the words to be
memorized (words randomly arranged on a study sheet). This "pre-look"
precedes the first presentation only. Instructions to the Ss suggest no
particular mode of organization on their study sheets, but do say "arrange
the words on the study sheet to best help you memorize."

The first experiment also included a post-experiment period during
which each S indicated (by bracketing and labeling) how he attempted to
organize the items and why. This last step was solely for checking on
the "validity" of the objective scoring procedures applied to the
subjective organizations on the study sheets. With rectangular arrays of
cells on the study sheets, Ss employ horizontal or vertical (rarely
mixed) lists on their study sheets. Within these overall orientations
(which may be established objectively, cf. Experiment Two) simple
adjacency on the study sheet is an adequate (and "valid") criterion for
defining "belonging to the same organizational unit." In other words,
if an S organizes vertically, e.g., then each column of adjacent words
in the matrix contains a cluster so far as the S is concerned. Thus,
subjective clusters may be objectively defined on the basis of
performance during the learning experiment.
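
The adjacency criterion can be illustrated with a short sketch. It is not the scoring program used in these experiments; it assumes a study sheet is represented as a list of rows with None marking an empty cell, and the function name and the example words are hypothetical.

```python
from typing import List, Optional

Grid = List[List[Optional[str]]]  # rows x columns; None marks an empty cell


def clusters_by_adjacency(grid: Grid, orientation: str = "column") -> List[List[str]]:
    """Return each run of adjacent filled cells along the chosen orientation."""
    if orientation not in ("column", "row"):
        raise ValueError("orientation must be 'column' or 'row'")
    # Read the sheet row by row, or transpose it to read column by column.
    lines = grid if orientation == "row" else [list(col) for col in zip(*grid)]
    clusters: List[List[str]] = []
    for line in lines:
        run: List[str] = []
        for cell in line:
            if cell:
                run.append(cell)
            elif run:
                # An empty cell closes the current cluster.
                clusters.append(run)
                run = []
        if run:
            clusters.append(run)
    return clusters


# Hypothetical 4 x 2 study sheet from a column organizer.
sheet = [
    ["fork", "dog"],
    ["spoon", "cat"],
    [None, None],
    ["Monday", None],
]
column_clusters = clusters_by_adjacency(sheet, "column")
```

For this sheet the column reading groups "fork" with "spoon" and "dog" with "cat", and leaves "Monday" as a cluster of one; the row reading of the same sheet would instead pair "fork" with "dog".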

The Three Experiments 

Experiment One 

This was the first test of the paradigm. It was anticipated
that the paradigm might provide for a "look inside the S" in what was
otherwise a typical free-recall experiment. Hence, a control group
(Control 2) was included that learned under "typical free-recall"
conditions. Another control group (Control 1) was also included to
check on the possible effects of the overt "organizing" activity
expected of Ss using the study sheets.

Method for Experiment One. Three groups of Ss each learned forty
words. Thirty-four of the words were taken from Underwood and
Richardson's norms (1956) and consisted of four categories of high dominance
and four categories of low dominance words, and four words in a
miscellaneous category such that there were minimal relations amongst
those words and between those words and the eight other dominance
categories. The remaining six words on the list were in two categories,
namely, table utensils and class days. Table 1 presents the forty
words, their dominance classification, and category names.



Insert Table 1 about here 



The experimental group (n = 24) learned via the paradigm as
described. Timing was such that there were five sec. per word during
presentation (via slides), there were sixty sec. of study during which
the S studied his personally created study sheet, and there were 180 sec.
(4 1/2 sec. per word) for writing on the tests. There were four
repetitions of the sequence, or four trials. The Ss of the first control
group, Control 1 (n = 24), had exactly the same conditions except for
their study sheets. Their study sheets were just a single column of
40 spaces and they were instructed to write the words on their study
sheets in order as they saw them. Thus the only difference was that
they did not have an opportunity to organize on their study sheets.

The Ss of a second control group, Control 2 (n = 21), had essentially
the same conditions except that they did not have a study sheet at all,
nor did they have the sixty sec. of study time. They were given five
trials, and for comparing this group with the others their performance
(words correct and a clustering score) was linearly interpolated so that
comparisons were made at four points at which the three groups had
equal times in the learning situation.
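
The equal-time comparison can be sketched as follows. The sketch assumes that elapsed time per trial is the sum of the presentation, study, and test periods (440 sec. for the study-sheet groups, 380 sec. for Control 2); the exact computation is not stated in this report, and the scores used here are hypothetical.

```python
from bisect import bisect_left


def interpolate_scores(times, scores, targets):
    """Linearly interpolate (time, score) observations at the target times."""
    result = []
    for t in targets:
        if not times[0] <= t <= times[-1]:
            raise ValueError(f"target time {t} lies outside the observed range")
        hi = bisect_left(times, t)  # index of the first observation at or after t
        if times[hi] == t:
            result.append(float(scores[hi]))
            continue
        lo = hi - 1
        frac = (t - times[lo]) / (times[hi] - times[lo])
        result.append(scores[lo] + frac * (scores[hi] - scores[lo]))
    return result


# Assumed trial lengths in seconds: 40 words x 5 sec. presentation,
# 60 sec. study (study-sheet groups only), 180 sec. test.
EXPERIMENTAL_TRIAL = 40 * 5 + 60 + 180  # 440
CONTROL2_TRIAL = 40 * 5 + 180           # 380

control2_times = [CONTROL2_TRIAL * k for k in range(1, 6)]    # end of each of five trials
target_times = [EXPERIMENTAL_TRIAL * k for k in range(1, 5)]  # end of each of four trials

# Hypothetical Control 2 scores (words correct on each of the five trials).
control2_scores = [18, 25, 30, 34, 36]
equal_time_scores = interpolate_scores(control2_times, control2_scores, target_times)
```

Under these assumptions the four comparison points (440, 880, 1320, and 1760 sec.) all fall inside the span covered by the five Control 2 trials (380 to 1900 sec.), so no extrapolation is needed.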

Results for Experiment One. The overall results in terms of number of
words correct are depicted in Figure 1. An analysis of variance indicates
a significant difference amongst groups (F = 8.171, df = 2,66, p < 0.001),
a significant trials effect (F = 473.02, df = 3,198, p < 0.001), and a
non-significant trials-by-groups interaction (F = 1.376, df = 6,198,
0.20 < p < 0.25). Collapsing across trials and applying the Newman-Keuls
method of a posteriori comparisons (Winer, 1962) to the resultant analysis
of variance indicates a significant difference (p < 0.01) between
the experimental group and each of the two control groups, but no
significant difference (p > 0.05) between the two control groups. The means
for total number of words correct are: Exper. grp. = 131.04, Control
1 = 119.25, and Control 2 = 119.89.



Insert Figure 1 about here



TABLE 1. The forty words used in Experiment One, classified according to
dominance level (Underwood and Richardson, 1956) and category.

The 24 Ss in the experimental group divided themselves as follows:
six of them wrote the words on their study sheets in the order in which
they were presented, i.e., they did not organize (sub-group NO) on their
study sheets. Five of the Ss alphabetized (sub-group AO), and the
remaining thirteen Ss organized on the basis of subjective meaning
(sub-group MO). For MO Ss the categories were identified, in the
post-experimental session, by such names as "food items, utensils, animals,
sorority, structures, miscellaneous, made-a-story," etc. Collapsing
across the four trials, an analysis of variance indicates significant
differences amongst the total number of words correct for the three
sub-groups (F = 6.76, df = 2,21, p < 0.01), and the Newman-Keuls procedure
indicates significant (p < 0.01) differences between NO and AO, and
between NO and MO, but no significant (p > 0.05) difference between MO
and AO. The means for total number of words correct are: AO = 137.6,
MO = 133.8, and NO = 119.5. The six Ss who showed no organization on
their study sheets (sub-group NO) averaged fewer correct on each of the
four trials, attaining an average of about 36.5 correct on trial four.
Their learning curve shows very little negative acceleration across the
four trials. By contrast the learning curves for the AO and MO sub-groups
show sharp negative acceleration, and an obvious ceiling effect as they
average about 39 (out of a possible 40) correct by trial four. The
results are depicted in Figure 2.
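
As a consistency check on the reported means, the sub-group means weighted by sub-group size should reproduce the experimental-group mean of 131.04. The short sketch below does this arithmetic; the names are illustrative, and the MO mean of 133.85 is the more precise value reported later in this section.

```python
# (number of Ss, mean total words correct) for each experimental sub-group,
# as reported in the text.
subgroups = {
    "NO": (6, 119.50),
    "AO": (5, 137.60),
    "MO": (13, 133.85),
}


def pooled_mean(groups):
    """Mean of the combined group, weighting each sub-group mean by its size."""
    total_n = sum(n for n, _ in groups.values())
    return sum(n * mean for n, mean in groups.values()) / total_n


experimental_mean = pooled_mean(subgroups)
```

The weighted mean agrees, to two decimal places, with the 131.04 reported for the experimental group as a whole.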



Insert Figure 2 about here 



The relations between proportion of items correct and the
characteristics of the E-defined categories were examined separately for the NO
and MO sub-groups of the experimental group. The data for the alphabetizers
(AO) were not examined in this analysis since their mode of subjective
organization specifically disrupts the E-defined categories. Figure 3
depicts the results for the four different kinds of E-defined categories
"built in" to the list of forty words. The ordinate is the proportion of
items correct (collapsed across all four trials) for each kind of category,
for each of the two sub-groups.



Insert Figure 3 about here



The difference between high and low
dominance categories for the 13 Ss of the MO sub-group is significant
(t = 3.09, df = 12, p < 0.01), as is the same comparison for the 6 Ss
of the NO sub-group (t = 2.95, df = 5, 0.02 < p < 0.05). There are no
differences (to the second decimal place) in proportion correct between
words in low dominance categories and words unrelated according to the
dominance norms. The words in the two "cue categories" (i.e., table
utensils and class days) were correct for almost all of the Ss of both
sub-groups from trial one on. Of the six Ss in the NO sub-group one S
missed one of the six words on trial two. Of the thirteen Ss in the MO
sub-group two missed the three table utensil words on trial two, while a
third S missed them on trial three. Thus, the words of the "cue categories"
showed essentially perfect learning from trial one through trial four. The
words of the other categories showed "typical" learning curves across the
four trials.
The degree to which words cluster on the tests may be examined
by counting the number of times words in the same category are adjacent
to each other in the test lists. This count of number of adjacencies is
referred to as repetitions observed (RO). If the words on a test are
rearranged so that all words in the same categories are adjacent to each
other, and then a count made of the number of adjacencies, the measure
referred to as repetitions possible (RP) is obtained. RP is directly
related to number of correct words on the test. It may be calculated
by the formula

$$RP = \sum_{i=1}^{k} (m_i - 1),$$

where $m_i$ is the number of words in the i-th category that appeared on
the test and k is the number of different categories represented by at
least one word on the test. Fig. 4 depicts
one way of examining the relationships between RO, RP, and the other
variables of the experiment. The RO variable is on the ordinate, the
RP variable on the abscissa, and the 45 degree line represents the locus
of points defined by perfect, or total, clustering (RO = RP).



Insert Figure 4 about here
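
The two counts can be illustrated with a short sketch. It assumes every recalled word belongs to exactly one category; the function names, the category labels, and the example recall are hypothetical.

```python
from collections import Counter


def repetitions_observed(recall, category_of):
    """RO: number of adjacent pairs in the recall list that share a category."""
    return sum(
        1
        for first, second in zip(recall, recall[1:])
        if category_of[first] == category_of[second]
    )


def repetitions_possible(recall, category_of):
    """RP: sum over represented categories of (words recalled - 1)."""
    counts = Counter(category_of[word] for word in recall)
    return sum(m - 1 for m in counts.values())


# Hypothetical category assignments and one hypothetical test list.
category_of = {
    "fork": "utensils", "spoon": "utensils", "knife": "utensils",
    "dog": "animals", "cat": "animals",
}
recall = ["fork", "dog", "spoon", "knife", "cat"]

ro = repetitions_observed(recall, category_of)
rp = repetitions_possible(recall, category_of)
```

In this example only "spoon" and "knife" are adjacent members of one category, so RO is 1, while regrouping the list by category would give RP = (3 - 1) + (2 - 1) = 3.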

Fig. 4a depicts trial-two data for the 13 Ss of the MO sub-group.
The circles indicate clustering scores based on the E-defined categories
built into the list. The Xs indicate clustering scores based on the S's
own categories as defined on his study sheet for that trial. Fig. 4b
depicts the same things for trial four. When categories are defined by
the Ss the cluster of points moves closer to the RO = RP line as learning
progresses. When the categories are defined by E the cluster of points
moves further away from the RO = RP line as learning progresses. Fig. 4c
depicts this shifting in terms of group averages for the four trials.
Once again the circles indicate clustering scores based on the E-defined
categories and the Xs indicate clustering scores based on the
study-sheet-defined categories. The steady progression over trials towards
perfect clustering for the study-sheet-defined categories, and the steady
progression over trials away from perfect clustering for the E-defined
categories make statistical analysis appear superfluous. The distance
of each point from the RO = RP line (along a perpendicular to the RO = RP
line) may be shown to be equal to (1/√2)(RP - RO). An analysis of
variance of the (RP - RO) measures supports the obvious in Fig. 4c. The
trials-by-definitions interaction is significant (F = 28.8, df = 3,32*,
p < 0.001). For the E-defined categories the mean (RP - RO)'s for the
four trials are 9.0, 15.1, 17.2, and 17.9, and the trials effect is
significant (F = 46.5, df = 3,36, p < 0.001). For the study-sheet-defined
categories the corresponding means are 6.2, 4.3, 4.5, and 2.2, and
the trials effect is also significant (F = 4.62, df = 3,32*, 0.005 <
p < 0.01).

*With 13 Ss and four trials, the df should be 3,36. RP - RO values were
not available for four Ss on trial one (they wrote the words in the
arbitrary order in which they were presented). Average trial-one values
were used for these missing scores in the analysis, and four df subtracted
in computing the error mean-squares.
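
The relation between perpendicular distance and (RP - RO) stated above can be checked numerically. The sketch below computes the distance geometrically, from the foot of the perpendicular on the RO = RP line, and compares it with (RP - RO)/√2; the function name and the example point are illustrative.

```python
import math


def distance_to_perfect_clustering(rp, ro):
    """Perpendicular distance from the point (RP, RO) to the line RO = RP."""
    # The foot of the perpendicular on the line y = x has both coordinates
    # equal to the average of the point's two coordinates.
    foot = (rp + ro) / 2.0
    return math.hypot(rp - foot, ro - foot)


# Illustrative point: RP = 20 on the abscissa, RO = 2 on the ordinate.
geometric = distance_to_perfect_clustering(20, 2)
from_formula = (20 - 2) / math.sqrt(2)
```

Because the two quantities differ only by the constant factor 1/√2, an analysis of (RP - RO) is equivalent to an analysis of distance from the perfect-clustering line.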

Degree of clustering as a function of the character of the E-defined
category of which the items were members is depicted in Fig. 5,
in a manner analogous to that in Fig. 3 for proportion of items correct.



Insert Figure 5 about here



These are the data for sub-groups MO and NO. The value on the ordinate
of Fig. 5 is defined as follows: for each S, for each E-defined category
of words, the (RP - RO) measure is summed across the four trials and the
sum is then divided by the sum of the corresponding RP measures. These
proportions are the basic data for this analysis. They represent the
distance from perfect clustering, collapsed across trials, relative to
the maximum distance possible given the S's particular performance on his
four tests. In formula the ratio may be represented as:

$$\frac{\sum_{\text{4 trials}} (RP - RO)}{\sum_{\text{4 trials}} RP}$$

for each S, for each E-defined category of words. The values plotted
in Fig. 5 are simply averages of these ratios across appropriate Ss.
The ordinate scale is inverted so that "more clustering" is "higher" on
the ordinate. There are no significant differences in clustering amongst
the dominance-defined categories of words (not even one of the possible
within-S t-tests had a p < 0.10), despite the fact that high and low
dominance words did differ in terms of proportion correct (see Fig. 3
and associated analysis). These results are in essential agreement with
those of Bousfield and Puff (1964). The "cue" words, i.e., table
utensils and class days, show essentially perfect clustering.

The two control groups and the experimental sub-group NO may
be examined for total number correct (collapsed across trials) and
for total (RP - RO) scores for the E-defined categories. There are
no obvious differences amongst the three groups on either measure,
but all three are different from the experimental sub-group MO on
both measures. For the number correct measure the means are:
119.25 for Control 1 (n = 24), 119.50 for sub-group NO (n = 6),
119.89 for Control 2 (n = 21), and 133.85 for sub-group MO (n = 13).
Differences in variances are not significant (0.05 < p < 0.10 via
Bartlett's test), and an analysis of variance yields an F = 6.46
(df = 3,60) for between groups (p < 0.01). The Newman-Keuls procedure
indicates the MO mean different from the other three (p < 0.01),
but no differences amongst those three (p > 0.05). For the (RP - RO)
measure the means are: 49.00 for sub-group NO (n = 6), 50.43 for
Control 2 (n = 21), 51.75 for Control 1 (n = 24), and 59.23 for
sub-group MO (n = 13). Differences in variances are not significant
(p > 0.05 by F test), and an analysis of variance yields an F = 3.03
(df = 3,60) for between groups (0.01 < p < 0.05). The Newman-Keuls
procedure indicates the MO mean significantly different from the NO mean
(p < 0.05), differences between the MO mean and the two control group
means approaching significance (p just slightly greater than 0.05), and
differences amongst the NO, Control 1, and Control 2 means not
significant (p much greater than 0.05).

Discussion for Experiment One 

The opportunity to overtly organize (on study sheets) the material
to be learned facilitated the learning of the material. Simply writing
the words in their random orders of presentation did not facilitate
learning relative to the standard free-recall (no study writing at all)
conditions. Further, only those Ss who took advantage of the opportunity
to overtly organize the material (sub-groups MO and AO) were the ones to
show more rapid acquisition of the material. Those Ss who were given the
opportunity to overtly organize the material, but who failed to utilize
this opportunity (sub-group NO), showed acquisition performance
indistinguishable from those Ss not given the opportunity to overtly organize
(Control 1 and Control 2). The same general relationships amongst these
groups and sub-groups are also true in terms of the clustering behavior
(for the E-defined categories).

Though the number of Ss who utilized alphabetic organization (n = 5)
was smaller than the number who utilized organization according to
subjective meaning (n = 13), the present experiment indicates no significant
difference in learning performance for the two modes of organization.

The E-defined categories based on the level of dominance definitions
were, in general, not utilized by the Ss. Though the high dominance
words were somewhat easier to learn than the low dominance and non-related
(according to dominance) words, there was no difference in clustering for
these different levels of dominance. Further, clustering performance in
terms of these "built-in" categories actually indicates a decrease in the
utilization of these categories as learning progresses. The category
definitions based on dominance are not "persuasive." Ss tend to ignore
them, and appear to find other criteria for categorizing or organizing
the words. Bousfield and Puff (1964) report contrasting results.

In contrast to the dominance-defined categories the two "cue
categories" were highly salient and "persuasive." Almost all Ss showed
perfect retention for these words from trial one on, and sub-group MO
also showed essentially perfect clustering from trial one on. Sub-group
NO and the two Control groups show a rapid increase in clustering for
these words, reaching essentially perfect clustering by trial two or
three (this last observation is just descriptive, i.e., it is not analyzed
statistically and is, therefore, not reported in the results section).
Thus, it is possible for E to define categories which most Ss will
adopt in their subjective organization of material to be learned.

The marked difference in subjectively defined clustering relative
to E-defined clustering leads to the obvious conclusion that Ss may be
ignoring the E-defined independent variable. The present paradigm provides
for a check on the extent to which this may be true, and for the
development of E-defined variables of varying and known degrees of
"persuasiveness."

It was originally thought that the study-sheet paradigm would
provide a "sneak look inside the S," if you will, in the usual free-recall
experiment. This is obviously not so. The opportunity to organize on
their study sheets changes the Ss' behavior. For example, (1) no control
S alphabetized, but five experimental Ss did, and (2) experimental Ss
got more words correct. It is proposed, however, that the experimental
paradigm presented is no less interesting than the standard free-recall
situation, or any other standard learning paradigm for that matter.



Experiment Two 

Since the concept dominance variable was impotent with respect to
clustering or "organizing" behavior another E-defined variable was sought.
Further, generalization of the findings of Experiment One required at
least one other set of stimulus materials. The conclusion that the
study-sheet paradigm was not providing a "look inside of" the standard free
recall paradigm led to considerations for maximizing the usefulness of
the overt organizing behavior of the Ss. Performance on study sheet
one of the first experiment was essentially useless since Ss didn't know
the total composition of the list to be memorized until after they had
completed that first study sheet. To eliminate this problem all Ss were
given a "pre-look" at the total list of words to be memorized. This was
done prior to the first trial only, and the words were randomly arranged
so as to continue not suggesting any particular organization. Instructions
remained "arrange the words on the study sheet to best help you memorize."
The list was made longer in order to avoid the ceiling effect exhibited
by most of the experimental Ss of Experiment One by trials three and four.
Control groups were eliminated and all Ss run under study-sheet conditions
in order to maximize the quantity of study-sheet data on which Ss "organized
according to meaning." This larger number of completed study sheets was
used to develop a more complete objective procedure for establishing the
subjective categories of each S.

Method for Experiment Two. Fifty-eight Ss yielded the data of Experiment
Two, all of them learning via the study-sheet paradigm. One subgroup of
21 Ss had four trials. Two subgroups of 15 and 22 Ss each had five trials.

The materials of Experiment Two consisted of 72 words chosen from
those utilized by Marshall (1967) in the study previously referred to.
The 72 words were really 36 pairs of words, the 36 pairs being divided
into subsets of six pairs each. Each subset of six pairs differed from
the next in terms of the range of the Mutual Relatedness (MR) index
between the words of the pairs. Thus the pair "man-woman" was in the
subset of six pairs having a high degree of MR, while the pair
"minute-day," for example, was at the opposite end of the continuum. The MR
index is based on normative association data and reflects the degree to
which each word of the pair elicits the other, and words in common,
relative to all words elicited by both members of the pair.

Further, each subset of six pairs was divided into two sub-subsets
of three pairs each, one of the sub-subsets involving pairs which might
be called categorized, that is, each member of the pair could easily be
incorporated within one category name, e.g., man and woman are both human
beings. These are contrasted with the non-categorized pairs, e.g.,
spider-web or food-eat. The 36 pairs of words are listed in Table 2, arranged
according to MR level, and separated into categorized and non-categorized
groups. In the Marshall study previously referred to a different group of
Ss was utilized for each MR level, with twice as many pairs of words at
each level as were used in the present experiment. In the present
experiment, however, all Ss were presented with pairs of words at all MR
levels, thus yielding within-S comparisons across MR levels, whereas the
Marshall study yielded between-S comparisons across MR levels. Further, the
Marshall study utilized the relatively standard free recall learning
paradigm while the present study employed both a "pre-look" and the
study-sheet paradigm.

There was 1 1/2 min. for the pre-look. The presentation was at
the rate of 5 sec. per word (six min. to present all 72 words once). There
was 1 1/2 min. for study of the study sheet, and the test was timed for
five min. Repeats of all but the pre-look made up the subsequent trials.
Several different random arrangements of the words on a blank study sheet
were used for the pre-look. All Ss saw the same series of random sequences
of words during the presentation periods of the successive trials.

All pre-look, study, and test sheets were on 8 1/2" x 14" ("legal"
size) sheets. The study sheets had 12 columns each about 1 1/16" wide,
and 28 rows each about 1/4" high, thus outlining 12 x 28 cells each about
1 1/16" x 1/4". For the pre-look sheet the 72 words were randomly
assigned to 72 out of the 336 cells. The test sheets contained two long
columns of numbered spaces, 1-36 and 37-72.

The procedure for objectively establishing the subjective categories
from the individually prepared study sheets involved the following: the
first step was to search all study sheets to determine whether or not the
words were written down in the order in which they were presented. If 90%
or more of the words were written in the order in which they were presented,
either by column or by row, then the study sheet was classified as an
"order of presentation" study sheet, and for that S on that trial there
was no information for determining subjective categories. Twenty-three
subject-trials out of a total of 269 fell into this category. For the
remaining 246 subject-trials the determination of which words the S
intended to group together was made exclusively in terms of the geometry
of filled and empty cells on the study sheets, that is, the particular
contents of the cells were completely ignored. Cells adjacent to each
other were scored as "belonging together." The words in them, therefore,
were scored as members of a single subjective category. The problem was
one of determining whether it was to be horizontal or vertical adjacency
that would be used in the scoring.

Five pairs of measures were used for
the determination. The first involved a count of the number of adjacent
filled cells in going down each column, and then an equivalent count in
going across each row. The second involved a count of the number of
transitions from a filled cell to an empty cell as the study sheet was
examined, first column by column, and then row by row. The third measure
involved a count of the number of words in the upper-most row of the study
sheet which was utilized by the S, thus defining a margin count for column
organization. A similar count was made of the number of words in the
left-most column utilized, thus defining a margin count for row
organization. The fourth measure involved examining for isolated groups of
filled cells, that is, filled cells surrounded by spaces and/or margins. A
separate count was then made of the filled cells in these isolated
clusters, first by column and then by row. The fifth and last pair of
measures involved the variance of the counts going into the first pair of
measures.

The next step involved
inserting these five pairs of numbers into a somewhat complex logic program,
and the outcome was the classification of the S into either a row organizer
or a column organizer for that study sheet. After eliminating two Ss out
of the original sixty for failing to follow instructions it was possible
to build the logic ad hoc so as to successfully classify all 269
subject-trials involved. The cross validation on an additional sample remains
to be done, but success in 269 out of 269 cases is quite promising. Success,
of course, is here defined in terms of agreement with Es' judgments based
on examining the study sheet and the contents of the cells. Incidentally,
most Ss were column organizers and very, very few had any mixed modes of
organization.
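
The order-of-presentation screen and the first two pairs of measures can be sketched as follows. This is an illustration only: the logic program that combined the five pairs of measures is not described in enough detail to reproduce, so the sketch stops at computing the counts. The grid representation (a list of rows, None for an empty cell), the function names, and the operational definition of "written in the order presented" (the share of adjacent written words that were also consecutive in the presentation) are all assumptions.

```python
from typing import List, Optional, Sequence

Grid = List[List[Optional[str]]]  # rows x columns; None marks an empty cell


def sheet_lines(grid: Grid, by: str) -> List[List[Optional[str]]]:
    """Return the sheet as a list of columns (by='column') or of rows (by='row')."""
    if by not in ("column", "row"):
        raise ValueError("by must be 'column' or 'row'")
    return [list(c) for c in zip(*grid)] if by == "column" else [list(r) for r in grid]


def adjacent_filled(grid: Grid, by: str) -> int:
    """First measure: count of filled cells immediately followed by a filled cell."""
    return sum(
        1
        for line in sheet_lines(grid, by)
        for a, b in zip(line, line[1:])
        if a and b
    )


def filled_to_empty(grid: Grid, by: str) -> int:
    """Second measure: count of transitions from a filled cell to an empty cell."""
    # Assumption: running off the edge of the sheet is not counted as a transition.
    return sum(
        1
        for line in sheet_lines(grid, by)
        for a, b in zip(line, line[1:])
        if a and not b
    )


def is_order_of_presentation(
    grid: Grid, presented: Sequence[str], by: str, threshold: float = 0.90
) -> bool:
    """Screen for sheets that simply copy the presentation order (assumed definition)."""
    written = [w for line in sheet_lines(grid, by) for w in line if w]
    if len(written) < 2:
        return False
    position = {word: i for i, word in enumerate(presented)}
    in_order = sum(
        1
        for a, b in zip(written, written[1:])
        if a in position and b in position and position[b] == position[a] + 1
    )
    return in_order / (len(written) - 1) >= threshold


# Hypothetical 3 x 2 sheet: three words down the first column, one in the second.
sheet = [
    ["a", "d"],
    ["b", None],
    ["c", None],
]
presented = ["a", "b", "c", "d"]
```

For this sheet the column reading gives more adjacent filled cells and fewer filled-to-empty transitions than the row reading, which is the kind of contrast the paired measures were designed to pick up.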

Results for Experiment Two 

Despite large procedural differences between the Marshall (1967)
study and the present one, the results are remarkably similar in terms of
the E-defined variables. Fig. 6 depicts the average number of words correct
for the categorized versus the non-categorized words in the list. In both
studies categorized words were recalled better than non-categorized words.



Insert Figure 6 about here



Fig. 7. (part 2 of 2) Experimenter-defined test-sheet organization
as a function of Mutual Relatedness (MR) level. Pairs of
categorized and non-categorized words are equated for MR level.
The index for organization is the number of repetitions
observed (RO) minus the number of repetitions expected on the
basis of chance (RC). For the three pairs of words at each
point the maximum value of the index is 2.0, the minimum
value is -1.0, and the chance value is 0.0.

But the most interesting finding common to both studies is depicted in
Fig. 7. In this figure there is plotted, for each of the five trials, an
index of clustering, on the ordinate, as a function of MR level, on the
abscissa, with the parameter within each graph being the distinction
between categorized and non-categorized pairs. The index of clustering
is the average number of repetitions observed (RO), minus that number of
repetitions (RC) which would be expected on the basis of chance alone
given the particular set of words the S produced on his test. This
adjustment for chance is based on Bousfield's formula as reported by
Dallett (1964). The formula is:



where n_i is the number of words in the i-th category which appeared on the
test. The derivation and extension to the present case is included in
Appendix B. The relationships depicted in Fig. 7 may be summarized as
follows: beyond trial one the figures consistently indicate little or
no distinction between categorized and non-categorized pairs at the
highest MR level, with a widening distinction between the two kinds
of pairs as MR level decreases to the low end. These results agree
very nicely with those reported by Cofer (1965) and Marshall (1967),
despite the fact that a different index of clustering was used in that
experiment, and despite all of the procedural differences between
the two experiments. However, an additional note to keep in mind with
respect to Fig. 7 is the movement of the lines from test to test relative
to the zero or chance line, and relative to the 2.0 or maximum possible
upper limit. In general, the low MR pairs, particularly the non-categorized
ones, cluster less than would be expected by chance, and this negative
cluster score actually increases in magnitude from test to test. It is
somewhat offset by corresponding increases relative to chance for the
words at the higher end of the MR scale, in particular the categorized
words. However, note that nowhere is a point to be found above 1.0 on
the dependent measure scale, with a value of 2.0 being the score that
would be obtained if perfect clustering occurred. As a matter of fact,
if the data are examined in terms of a slightly different measure, namely
the one used in Experiment One, i.e., the number of repetitions possible
(RP) minus the number of repetitions observed (RO), one finds that as
learning progresses the difference actually grows. Thus, collapsing over
the MR levels, the overall E-defined clustering actually decreases with
learning. This is not true for subjectively defined clustering. The
results are very similar to those for Experiment One as depicted in Fig. c. 
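
The chance-corrected index (RO - RC) discussed above can be sketched in a few lines. The formula itself is not legible in this copy, so the sketch assumes the form commonly attributed to Bousfield, RC = (sum of n_i squared) / N - 1, where N is the total number of categorized words on the test. That assumed form does reproduce the limits given in the caption of Fig. 7 (maximum 2.0 and minimum -1.0 for three pairs of words). The function names and the example words are hypothetical, and the extension described in Appendix B is not attempted.

```python
from collections import Counter


def repetitions_observed(recall, category_of):
    """RO: count adjacent words in the recall order that share a category."""
    return sum(
        1
        for first, second in zip(recall, recall[1:])
        if category_of[first] == category_of[second]
    )


def repetitions_chance(recall, category_of):
    """RC under the ASSUMED formula: sum(n_i ** 2) / N - 1 (non-empty recall)."""
    counts = Counter(category_of[word] for word in recall)
    n_total = len(recall)
    return sum(n * n for n in counts.values()) / n_total - 1


def clustering_index(recall, category_of):
    """RO - RC for one test sheet."""
    return repetitions_observed(recall, category_of) - repetitions_chance(
        recall, category_of
    )


# Hypothetical example: three word pairs, as at one point of Fig. 7.
category_of = {
    "arm": "p1", "leg": "p1",
    "head": "p2", "toe": "p2",
    "foot": "p3", "knee": "p3",
}
perfect = ["arm", "leg", "head", "toe", "foot", "knee"]
scattered = ["arm", "head", "foot", "leg", "toe", "knee"]

print(clustering_index(perfect, category_of))
print(clustering_index(scattered, category_of))
```

With every pair adjacent the index reaches its maximum, and with no pair adjacent it falls to its minimum, which is the range against which the curves of Fig. 7 are read.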

Fig. 8 depicts some of the characteristics of the subjective 
clustering as measured on the study sheets. The large upper plot simply

Fig. 8. Frequency of use and number of words involved as a
function of category size (subjective categories
defined via study-sheet performance). Data are shown
for trials one through four.

gives the average frequency of use of subjective categories of various
sizes, for trial one and for trial four. The three smaller insert plots 
trace the average frequency of use across four trials, and do so for 
isolated words and for categories of two through eight words. The number 
of isolated words on the study sheets decreases as a function of trials. 

All subjective categories of two through eight words show increased 
average frequency of usage as a function of trials. The dip in frequency 
of usage of categories of three words is believed to be due to the pairs 
of words "built-in" to the stimulus list. The lower two plots of Fig. 8
have as their dependent measure the average number of words involved in 
the categories of various sizes, as specified along the abscissa. The 
dependent measure is simply the average frequency of usage (of the upper 
plots) times the size of the category. (Once again the dip for categories 
of size three is believed to be due to the built-in pairs of words. A 
similar plot for the data of Experiment One, in which the minimum built- 
in category had three words, shows a dip for categories of two words.) 

The results for four trials are depicted in the two plots. The means of
the distributions move toward larger categories as a function of trials. 

By trial four approximately half of the 72 words, on the average, are to 
be found in subjective categories containing from four through seven words. 
Additional calculations indicate that from trial two on the average number 
of subjective categories used by each S is approximately fifteen for these
72 words. 
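
The two dependent measures of Fig. 8 are related by a simple product, which may be illustrated as follows. This is a minimal sketch, assuming each study sheet has already been reduced to a list of subjective category sizes (an isolated word counting as size one). The function name and the data are hypothetical.

```python
from collections import Counter


def usage_by_size(sheets):
    """Summarize study sheets by subjective category size.

    sheets: one list of category sizes per S.
    Returns {size: (average frequency of use, average number of words involved)}.
    """
    n_subjects = len(sheets)
    totals = Counter()
    for sizes in sheets:
        totals.update(sizes)
    return {
        size: (count / n_subjects, size * count / n_subjects)
        for size, count in sorted(totals.items())
    }


# Hypothetical study sheets for two Ss on one trial.
sheets = [[1, 1, 2, 4], [2, 2, 4]]
table = usage_by_size(sheets)
for size, (frequency, words) in table.items():
    print(size, frequency, words)
```

The second value in each row is the first multiplied by the category size, which is the relation between the upper and lower plots of the figure.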

Fig. 9 depicts the relation between the degree to which individual
Ss utilized their own subjective organization (on the abscissa) and the
number they get correct on a test (on the ordinate). The independent
variable is the same index used previously, namely, repetitions observed
minus repetitions expected by chance (RO - RC). However, the clusters
are here defined in terms of the Ss' own study sheets. Several things
are obvious from the scatter plots. First, ignoring for the moment those 
points falling on the chance (or zero) value of the independent variable, 
a relatively strong correlation is depicted between the two measures. In 
trial two, for example, this correlation approaches 0.8. In trials three 
and four, an obvious ceiling effect is present and the correlations conse- 
quently decrease. The points at the zero or chance clustering line are 
primarily for those Ss who showed no scoreable organization on their study
sheets, that is, their sheets were order-of-presentation arrangements. It is
fairly obvious that as learning takes place (test 1 to 2 to 3, etc.) the 
numbers of these points decrease quite markedly. In general, proceeding 
from trial one to two to three to four the points move from left to right 
indicating a growth in subjective clustering as learning takes place. 

Thus, we have number correct associated with degree of clustering across
Ss, and number correct associated with degree of clustering within Ss
across trials. In summary: the more an S utilizes his own subjective
clustering in ordering the words on his test, the better he is likely to
do in terms of number correct.

Another way of measuring subjective organization is in terms of
the consistency of word order from one test to the next for each S.
Tulving (1962) and Carterette and Coleman (1963) utilized what is referred
to as the SO measure for this purpose. The present analysis utilizes
the Kendall Tau coefficient, a measure very similar to the rank order
correlation coefficient.

Fig. 10 depicts frequency distributions or histograms for the
Kendall Tau coefficients calculated between adjacent trials. The distributions
are indicated separately for those Ss who organized in terms of
the meanings of the words (the "a" histograms). The "b" histograms
include those Ss who wrote words on their study sheets in the order in
which they were presented for at least one trial, and the "c" histograms
are for those Ss who utilized alphabetic organization for at least one
trial. Thus, there are three histograms for each adjacent pair of trials.
In general there is no marked movement toward higher Tau coefficients
with learning, except for the "b" histograms in going from trial one-two
to trial two-three. Even for the last pair of trials (Fig. 10)
the Tau coefficients are generally moderately positive but not very
impressive. Some alphabetizers exhibit near perfect order correlations,
that is, Taus near 1.0, but there are also alphabetizers with lower Taus.
Some alpha organizers simply cluster in terms of common first letters of
words, making no attempt to order these clusters alphabetically on their
tests. These Ss account for the low Tau coefficients among the alphabetizers.
In summary: a tendency for Ss to write words in the same order
from trial to trial shows no growth as learning takes place, with the
possible exception of a very small sub-group of alphabetizers who fully
utilized alphabetical organization. It is important to note, of course,
that the Kendall Tau coefficient is not the same index of consistency of
sequence that Tulving (1962) and Carterette and Coleman (1963) used.
Consequently, the results are not directly comparable. 
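
The order-consistency analysis above can be illustrated with a small computation of Kendall's Tau between two recall orders. This is a sketch under stated assumptions: the coefficient is computed only over the words common to both tests, and each word is assumed to appear at most once per test. The report does not say how words recalled on only one test were handled, so that restriction is an assumption. The function name and the words are hypothetical.

```python
from itertools import combinations


def kendall_tau(order_1, order_2):
    """Kendall's Tau between two recall orders, over the words common to both."""
    position_2 = {word: i for i, word in enumerate(order_2)}
    # Shared words, kept in their test-1 order, expressed as test-2 positions.
    ranks_2 = [position_2[word] for word in order_1 if word in position_2]
    n = len(ranks_2)
    if n < 2:
        raise ValueError("need at least two shared words")
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        if ranks_2[i] < ranks_2[j]:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical tests: both pairs stay together, but the clusters swap places.
test_1 = ["arm", "leg", "silk", "dress", "hammer"]
test_2 = ["silk", "dress", "arm", "leg", "pliers"]
print(kendall_tau(test_1, test_2))
```

In this example the two clusters are fully preserved from one test to the next, yet the coefficient is negative because the clusters changed places. An order-based measure of this kind registers sequence, not grouping.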

Is there an optimum number of words for a category in order to
minimize errors? Fig. 11 examines the relation between proportion of
words in error and the size of subjective category in which those words
were found. There is a subfigure for each of the five trials. Category
size on the study sheet is plotted on the abscissa. On the ordinate
there is a ratio, the numerator of which is the proportion of words
missed (P_i) on the test for the subjective category of size i, and the
denominator is the size of the category, i. P_i for the category of size
two (P_2) is found by tabulating, for each S on a given trial,
the number of words in two-word subjective categories which were missed

Fig. 11. Proportion of words missed for a subjective category
of size i (P_i) is approximately equal to a constant,
reflecting the overall average error rate (P), times i,
i.e., P_i ≈ (P)(i), or (P_i)/(i) ≈ P. There appear to
be no "optimum" subjective category sizes. (O.O.P.
stands for Order Of Presentation).

on the corresponding test. This number wrong is then summed across Ss
yielding a total number wrong for category size two (W_2). Also tabulated
is the number of different categories of two words used by each S. This
number-of-times-that-categories-of-size-two-are-used is then summed
across Ss to yield a count of the total number of times a category of
size two was used (U_2). P_2 equals the ratio (W_2/U_2). Dividing P_2 by
2 (in general, by i) is, in effect, multiplying U_2 by 2 (or, in general,
times i). The resultant ratio, the one on the ordinate of Fig. 11, is
one of number of words wrong divided by number of words involved. The
general picture in Fig. 11, it is proposed, is one of a horizontal line
with a fair amount of noise. Number wrong divided by number of words
involved is a constant (P) across the different category sizes, i,
where the constant (P) is simply the overall average error rate. The
implication is, simply, that there appears to be no "optimum" subjective
category size. If there were, one should find a dip in the curves, consistent
across trials, in the region of the optimum value of i. While
these findings are not to be directly compared with those of Dallet (1964),
for example, because of the distinction between subjective category size
and experimenter-defined category size, they are indirectly in support
of the conclusions reached by Cohen (1963 and 1966) concerning a constant
proportion of category recall across category sizes. 
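
The tabulation behind the ordinate of Fig. 11 can be sketched as follows. The sketch assumes each S's study sheet has been reduced to a list of (category size, number of words missed) entries, one per subjective category. That data layout, the function name, and the numbers are all hypothetical.

```python
from collections import defaultdict


def error_ratio_by_size(subjects):
    """Tabulate W_i, U_i, P_i = W_i / U_i, and P_i / i for each category size i.

    subjects: per S, a list of (category_size, words_missed) tuples,
    one tuple per subjective category on that S's study sheet.
    """
    wrong = defaultdict(int)  # W_i: words missed, summed across Ss
    used = defaultdict(int)   # U_i: number of times a category of size i was used
    for categories in subjects:
        for size, missed in categories:
            wrong[size] += missed
            used[size] += 1
    result = {}
    for size in sorted(used):
        p_i = wrong[size] / used[size]
        result[size] = (wrong[size], used[size], p_i, p_i / size)
    return result


# Hypothetical data for two Ss on one trial.
subjects = [
    [(2, 1), (2, 0), (4, 1)],
    [(2, 0), (4, 1), (4, 0)],
]
ratios = error_ratio_by_size(subjects)
for size, (w, u, p, p_over_i) in ratios.items():
    print(size, w, u, round(p, 3), round(p_over_i, 3))
```

In this constructed example P_i doubles from size two to size four while P_i / i stays the same, which is the horizontal-line pattern proposed for Fig. 11.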



New Measures of Subjective Clustering 

Additional measures have been developed for characterizing several
additional aspects of Ss' subjective organization. The first of these
measures (CON) provides an index of the consistency of an S's subjective
organization from one trial to the next. The second measure (STR) provides
an index of stereotypy of organization, or the degree to which words
are grouped in the same way across Ss. The third measure (CCP) is similar
to the stereotypy measure, but is designed specifically for examining
selected pairs of words of particular interest to E. It reflects the
proportion of Ss who put both words of the pair into the same subjective
category, with the proportion "adjusted," in effect, for the sizes of the
categories involved. The mnemonic stands for "common categorizing of
pairs."



All three measures utilize the same basic concept which is, in
essence, a "common elements" definition of the square of the correlation
coefficient. Fig. 12 illustrates the basic form of the measure. Consider,



Insert Figure 12 about here 



e.g., Roman numerals I and II as trials I and II. An S grouped or clustered
words B, C, & D on trial I, leaving words A and E as isolates. On trial
II his organization of the words shifted as indicated, i.e., A and B were
in one group, and C, D, and E in another. Thus, each word appears in

Fig. 12. Illustration of general form for index of
Consistency (CON) and Stereotypy (STR): a
"common elements" definition of r^2. Index
of Common Categorizing (CCP) for a pair of
words is based on the proportion of Ss who
put both words in the same category, with
the proportion "adjusted" for size of
categories.

two groupings or clusters, one for trial I and the other for trial II.
We may now look at each of the words, one at a time, and ask about the
overlap of the groups in which they are to be found. The two groupings
containing word C, e.g., have two words or elements in common, i.e.,
words C and D. Thus, the number of common elements for the two groupings
is two. This number goes into the numerator and is squared. The
denominator simply contains the product of the number of elements in each
of the two groupings which contain C. The result is (2)^2 / ((3)(3)) = 4/9.

This index is calculated for each of the words which appear on both lists.
These indexes are each labelled in the illustration in Fig. 12.



For the consistency measure (CON) the index is simply averaged
across all of the words of interest. This may be for particular pairs or
groups of words (e.g., words C, D, and E in the illustration of
Fig. 12), or for all of the words on the list. To date only the latter,
or overall, measure has been examined in detail.
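
The common-elements index and its average (CON) can be worked through for the five-word example of Fig. 12 as described in the text. This is a minimal sketch; the function names are hypothetical, and a word absent from a grouping is treated as an isolate, which is an assumption. Exact fractions are used so the values can be checked by hand.

```python
from fractions import Fraction


def group_of(word, grouping):
    """Return the group containing `word`; an unlisted word counts as an isolate."""
    for group in grouping:
        if word in group:
            return set(group)
    return {word}


def common_elements_index(word, grouping_1, grouping_2):
    """(number of common elements)^2 / (size of group 1 * size of group 2)."""
    g1 = group_of(word, grouping_1)
    g2 = group_of(word, grouping_2)
    return Fraction(len(g1 & g2) ** 2, len(g1) * len(g2))


def consistency(words, grouping_1, grouping_2):
    """CON: the index averaged across the words of interest."""
    values = [common_elements_index(w, grouping_1, grouping_2) for w in words]
    return sum(values) / len(values)


# The Fig. 12 example: B, C, D grouped on trial I; A-B and C-D-E on trial II.
trial_1 = [{"B", "C", "D"}, {"A"}, {"E"}]
trial_2 = [{"A", "B"}, {"C", "D", "E"}]

for word in "ABCDE":
    print(word, common_elements_index(word, trial_1, trial_2))
print("CON =", consistency("ABCDE", trial_1, trial_2))
```

For this example the five indexes are 1/2, 1/6, 4/9, 4/9, and 1/3, and their average is 17/45, or about 0.38.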



The stereotypy index (STR) is also very simple conceptually, though
it takes a computer to accomplish the very large number of calculations.
Returning to Fig. 12, consider Roman numerals I and II as representing a
pair of Ss on a given trial, instead of two trials for a given S. This
is the basis of the index. The computer program then calculates one such
index for each of the words, for every possible pair of Ss (the program
will handle up to 20 Ss at a time), and then averages across all of the
pairs of Ss. These averages may then be examined for individual words,
or further averaged across pairs or sets of words. 
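
The pairwise averaging described for STR can be sketched in the same way. This is an illustration only, not the original computer program: the function names, the treatment of unlisted words as isolates, and the three study sheets are all hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def group_of(word, grouping):
    """Return the group containing `word`; an unlisted word counts as an isolate."""
    for group in grouping:
        if word in group:
            return set(group)
    return {word}


def common_elements_index(word, grouping_1, grouping_2):
    """(number of common elements)^2 / (size of group 1 * size of group 2)."""
    g1 = group_of(word, grouping_1)
    g2 = group_of(word, grouping_2)
    return Fraction(len(g1 & g2) ** 2, len(g1) * len(g2))


def stereotypy(words, sheets):
    """STR per word: the index averaged over every possible pair of Ss."""
    pairs = list(combinations(sheets, 2))
    return {
        word: sum(common_elements_index(word, a, b) for a, b in pairs) / len(pairs)
        for word in words
    }


# Hypothetical study sheets of three Ss on one trial.
sheets = [
    [{"arm", "leg"}, {"silk", "dress"}],
    [{"arm", "leg"}, {"silk"}, {"dress"}],
    [{"arm", "leg", "silk", "dress"}],
]
str_by_word = stereotypy(["arm", "leg", "silk", "dress"], sheets)
overall = sum(str_by_word.values()) / len(str_by_word)
print(str_by_word, overall)
```

The number of pairs of Ss grows as N(N - 1)/2, so twenty Ss require 190 comparisons for every word, which is presumably why a computer was needed.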

The index of the common categorizing of pairs of words (CCP) is
based on a count of the number of words in the categories within which a
particular pair of words (e.g., A and B) may be found. For a given trial
(study sheet) this count is made for each S and summed across the set of
N Ss. The square of the sum goes into the numerator of the index. The
denominator contains the product of two sums, one for the number of words
in the categories within which word A is found and the other the
corresponding sum for word B. 
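
One possible reading of the CCP description may be sketched as follows. The details are in Appendix A, which is not reproduced here, so the following is an assumption rather than the author's computation: for each S the numerator count is taken to be the size of the category that holds both words, or zero when the words are in different categories. The function names and the study sheets are hypothetical.

```python
from fractions import Fraction


def group_of(word, grouping):
    """Return the category containing `word`; an unlisted word counts as an isolate."""
    for group in grouping:
        if word in group:
            return set(group)
    return {word}


def ccp(word_a, word_b, sheets):
    """ASSUMED form: (sum of shared-category sizes)^2 / (sum sizes A * sum sizes B)."""
    together = size_a = size_b = 0
    for grouping in sheets:
        group_a = group_of(word_a, grouping)
        group_b = group_of(word_b, grouping)
        size_a += len(group_a)
        size_b += len(group_b)
        if word_b in group_a:  # both words in the same subjective category
            together += len(group_a)
    return Fraction(together ** 2, size_a * size_b)


# Hypothetical study sheets of three Ss; two of them put "head" with "toe".
sheets = [
    [{"head", "toe", "arm"}],
    [{"head", "toe"}, {"arm"}],
    [{"head", "arm"}, {"toe"}],
]
print(ccp("head", "toe", sheets))
```

Under this reading the index is 1.0 when every S places the pair in one category and 0.0 when no S does, with intermediate values weighted by category size.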

Details and illustrative computations are contained in Appendix A.

In the initial evaluation of the three new measures it was necessary
to find twenty Ss with appropriate data. The study-sheet paradigm does
not guarantee usable data for every S. Only those Ss could be used who
exhibited some form of "meaning" organization, that was scoreable, on
every trial. It was necessary to delete from consideration two Ss for
not following instructions, two Ss who adopted an alphabetizing strategy,
two Ss who wrote the words on their study sheet in the order in which
they were presented for one or more trials, and three Ss for whom the
scoring rules failed on one or more trials (indicating more than 1/2 of
their words were isolated single words). Thus these initial analyses
are based on the first 20 (out of 29) "good" Ss. Similar analyses with
additional Ss are in progress.

Results of Experiment Two with Respect to the New Measures of Subjective
Clustering

Fig. 13 is a series of four scatterplots for the 20 Ss, relating 
the index of consistency (CON) and the number of words correct. The plot 



Insert Figure 13 about here 



in the upper left, i.e., trial one to two study sheet consistency on the 
abscissa and number correct on the trial two test on the ordinate, depicts 
the only significant r, r = 0.54 (.01 < p < .05). It appears as if
number correct is correlated with consistency of organization only for 
performance very early in learning. There is a suggestion in the plots 
that a ceiling effect may be washing out the correlation in the later 
trials. Fig. 14 is included to argue against that interpretation. The 



Insert Figure 14 about here 



dependent variable is the same, i.e., the number correct. The independent
variable is the index (RO - RC) which reflects the degree to which the
individual S's study sheet organization is reflected in his grouping of
words on his test. The ceiling effect as trials progress does appear to
be reducing the r's, but the r's are strong at least through trial three
(r = 0.74, p < .01). The data depicted in Fig. 14, incidentally, very
nicely replicate prior data on different Ss. For both Figs. 13 and 14
one may see a movement of the points to the right (and upward, of course)
as trials progress. Consistency of study-sheet organization increases with
trials, as does the degree to which the test organization reflects the
study sheet organization. But study sheet consistency appears to be
related to number correct, across Ss, only very early in learning.
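
The significance levels quoted for the two correlations can be checked with the usual t transformation of r. The sketch assumes that each r is based on the 20 Ss and that a two-tailed test was used; the report states neither explicitly for these values. The critical values are the standard tabled ones for 18 degrees of freedom.

```python
import math


def t_from_r(r, n):
    """t statistic for testing a Pearson r against zero, with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Two-tailed critical values of t for 18 degrees of freedom (standard tables).
T_CRIT_05 = 2.101
T_CRIT_01 = 2.878

t_con = t_from_r(0.54, 20)    # Fig. 13, consistency vs. number correct
t_ro_rc = t_from_r(0.74, 20)  # Fig. 14, (RO - RC) vs. number correct

print(round(t_con, 2), T_CRIT_05 < t_con < T_CRIT_01)
print(round(t_ro_rc, 2), t_ro_rc > T_CRIT_01)
```

Under these assumptions the first value falls between the .05 and .01 critical values and the second exceeds the .01 value, in agreement with the probabilities given in the text.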

The relationships between stereotypy (STR) and the E-defined
variables of MR and categorization are depicted in the upper series of
plots of Fig. 15. Stereotypy for the categorized pairs of words was
consistently above that for non-categorized pairs. There is a general
downward trend in stereotypy in going from Hi MR to Lo MR, though there
is a slight upturn at the two lowest MR levels. Stereotypy seems to
first increase and then decrease across trials.

The relationships between the index of common clustering (CCP) for
the E-defined pairs and the E-defined variables of MR and categorization
are depicted in the lower series of plots of Fig. 15. The relationships

Fig. 13. Number correct (ordinate) as a function of consistency of organization (CON).

are more orderly than for the STR measure, and nicely parallel the
relationships depicted in Fig. 7 for the (RO - RC) measure. The difference
between categorized and non-categorized pairs is smallest for high MR
values, and CCP decreases with decreases in MR. These results also
parallel those of the Marshall (1967) study referred to earlier.

Examination of individual pairs of words, however, indicates that there
are still some "big chunks of variance" to be accounted for. Some of
these are indicated as dotted lines in the bottom portion of Fig. 15.
Considering the lowest MR plot, the dotted line that reaches 1.0 represents
the word pair "head-toe." The very low dotted line is for "scissors-needle."
Both of these word-pairs are members of the categorized-lowest
MR group. There is more difference between these two pairs than there is
across the entire MR range for categorized pairs (though, of course, these
extreme pairs were picked to exaggerate the point). It turns out that
almost every S adopted a "parts-of-the-body" subjective category. Perhaps
this was triggered by the higher MR pairs "arm-leg" and "foot-knee" which
were also in the list. At any rate, most Ss put "head" and "toe" in the
same subjective category despite the low MR value relating them. In contrast,
"scissors" frequently went into a subjective category with "hammer"
and "pliers" and/or "dagger," while "needle" often went with "silk,"
"dress," "glove," and "cloak." Very few Ss put "scissors" and "needle"
in the same subjective category. These effects of the total context of
the list cannot be ignored if these "big chunks of variance" are to be
accounted for.



Discussion for Experiment Two 

A completely objective procedure for defining the subjective
clusters of Ss during learning has been worked out. This procedure
depends completely on the geometry of the filled and empty cells on a
study sheet. Judgments concerning "what Ss intended" are not necessary.
Though the objective procedure has been worked out it is time consuming
and expensive. It is suggested that much of this time and expense can
be saved with very minor modifications to the study sheet paradigm.
Instead of giving the Ss complete freedom as to how to organize on their
study sheets, the Ss are to be instructed to place those words which they
wish to go together into the same column. With an illustration, but with
care to avoid suggesting any particular organization, it should be possible
to get Ss to continue to "organize the words on your study sheet so as to
best help you to memorize." Assuming that Ss follow instructions it will
then no longer be necessary to go through the expensive and time consuming
objective procedure for determining whether a particular S organized
by columns or by rows.

The two E-defined variables which were built into the stimuli, i.e.,
the MR strength of word-pairs and the distinction between categorized and
non-categorized word-pairs, were quite potent with respect to subjective
clustering. This was true in terms of the (RO - RC) measure, the STR
measure, and the CCP measure. Word pairs with high MR strength are
salient and persuasive. Most Ss utilize those pairs in their subjective
organizations. Over all MR levels, however, the E-defined categories show
a relative decrease in frequency of usage as learning progresses when
compared with the frequency of usage of the subjectively defined
categories. Even for the highest MR levels clustering behavior in
terms of the E-defined categories never comes close to complete or perfect
clustering.

In terms of the E-defined variables built into the stimuli there
is strong agreement across the different measures of this study, namely:
(RO - RC), CCP, and to a considerable extent STR. These general results
also agree to a very marked extent with the comparable results of the
Marshall (1967) study despite wide differences in procedure, measurement,
design, and S population.

Ss use an average of approximately 15 clusters for these 72 words.
Clusters get larger with trials. Most of the 72 words are contained in
the clusters of from three to eight words each. There is no optimum size
of a subjective cluster in terms of number of errors made. It is also
proposed that there is no optimum number of categories either. This is
in contrast to the report by Mandler (1967). A preliminary examination
of the data of the present study, via scatter plot, indicates no relation
between the number of categories used and the number of items correct.
A careful evaluation of these data is yet to be made. "Goodness" of a
subjective chunk does not depend on the number of items in it. And, the
total list may be thought of as one big chunk with the categories within
it as its elements. In the conditions of the study-sheet paradigm Ss
manipulate the word elements until they achieve subjective chunks which
are approximately equally "good" subjectively. "Good" of course means
easily learned and (yet to be evaluated) remembered.

Ss whose tests reflect their study sheet organizations are the ones
who learn fastest. The degree of correspondence between study and test
organization increases with trials for almost all Ss.

Consistency of word order from one test to the next, as measured
by Kendall's Tau, is only moderately positive and shows no consistent
growth over trials. Consistency of organization from one study sheet to
the next does show growth over trials, but for any one trial it is related
(across Ss) to number correct only very early in learning.

The three new measures developed permit measurement of aspects of
subjective organization which have not been measured before. The stereotypy
and common categorization measures have been shown to have a kind of
concurrent validity in terms of the E-defined built-in variables.
Validity for the consistency measure is not as clearly established as yet.

The common categorization measure has pointed out the need to take
the total list context effects into account in accounting for subjective
organization behavior and, by inference, learning. These effects are a
big source of variance.



Experiment Three 

The stimuli for this experiment were designed to be very salient
and persuasive with respect to organization. Exhaustive categories of
words were used, e.g., north, south, east, and west, or mother,
sister, and brother. Most of the words were taken from Cohen's (1963)
report. The complete set of stimuli is shown in Table 3. Original plans
were to continue to utilize the study-sheet paradigm in order to check



Insert Table 3 about here 



on the persuasiveness of the E-defined categories. However, a pilot
study indicated that the categories were indeed very persuasive, and
that most Ss learned most of the 70 words in just over one trial. For
these reasons the study-sheet paradigm was abandoned for this experiment.
It was assumed that almost all Ss would utilize almost all of the E-defined
categories.

Data reported by Cohen (1963 and 1966) indicate that if an exhaustive
category is remembered at all (i.e., at least one member of it) then most
of that category will tend to appear on a free recall test. From this,
one may reason that the total of individual items of an exhaustive category
"come out" because of the pre-experimental history of S, and only one
item, or the name of the category, need be learned during the experiment.
If one item is eliminated from each of the exhaustive categories, however,
then S must also learn during the experiment which item of each category has
been left out. Hence, if other things could be kept equal, one would
predict "the shorter" list, i.e., the list with one item missing from
each category, should be the more difficult list to learn. But it is not
possible to keep "other things equal": a) If one eliminates one item
from each category then the stimulus list has been shortened by as many
words as there are categories in the list. Shorter lists are easier to
learn (in terms of proportion correct) than longer ones, but the
categorization effect predicts that the longer list will be easier. The
eventual direction of the difference will then be a function of which of
the two factors is more potent. b) If an attempt is made to keep the
lists the same length then categories must be added to the incomplete-category
list, and then the category effect would be confounded with
number of categories. Under these conditions both effects would be
expected to operate in the same direction. Hence, alternative (a) was
chosen. It was hypothesized that the category effect would be more potent
than the difference between a 70-item list and a 50-item list, thus predicting
the 50-item "incomplete-category" list to be more difficult to learn.
This is a dramatic, counter-intuitive prediction: a given list will be
easier to learn than the same list with 29% of the words removed.

Since there really was very little evidence to support the prediction

TABLE III. Stimulus words for Experiment

concerning the relative potency of these two effects, a third condition was
also included in the experimental design. For this condition a mixed list
was presented to the Ss, providing a within-S comparison. Ten out of the
20 categories in the list were left complete and the other ten had one
item removed from each of the categories.

Method for Experiment Three 

A 70-word list was constructed which contained 20 exhaustive
categories, ten categories with three words each and ten categories with
four words each. (See Table 3.) A group of 20 Ss learned this list. A
second list was generated by removing one word from each of the 20 categories
of the original list, thus leaving a 50-word list. A group of 22 Ss learned
this list. The third list had one word removed from each of ten of the
twenty categories, thus making for a 60-word list. A third group of Ss
(n = 30) learned this list. For all three groups words were presented in
random orders, two to three seconds per word, in a standard free recall
paradigm for six trials. Five minutes was allowed for each free-recall
test. The particular words eliminated for the 50- and 60-word lists are
indicated in Table 3.
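
The list lengths and the proportion of words removed follow directly from the category structure just described, as the short check below illustrates. The variable names are hypothetical; the numbers are those given in the text.

```python
# Category structure stated in the Method section.
n_three_word = 10
n_four_word = 10
n_categories = n_three_word + n_four_word

complete_list = 3 * n_three_word + 4 * n_four_word  # every category complete
incomplete_list = complete_list - n_categories       # one word dropped per category
mixed_list = complete_list - n_categories // 2       # one word dropped from ten categories

share_removed = (complete_list - incomplete_list) / complete_list

print(complete_list, mixed_list, incomplete_list)
print(round(100 * share_removed), "percent of the words removed")
```

The 20 words removed from the 70-word list amount to about 29 percent, the figure used in stating the prediction.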



Results for Experiment Three 

The percent of words correct for the complete and partial categories 
was the dependent measure. Results are depicted in Fig. 16. The definition



Insert Figure 16 about here 



of "complete" and "partial" is obvious for the 60-word list. For the 70-word
list all categories were complete, but the (partial) refers to those
categories which were partial for the 60-word Ss. The "complete" were,
of course, the same for both the 70- and 60-word lists. The (complete)
and partial categories of the 50-word lists are defined in a similar way,
i.e., with reference to what happened to those categories on the 60-word
list.



The between-Ss evaluation of the major hypothesis of the experiment
involves a comparison of the overall percent correct on the 50-word list
with the equivalent measure on the 70-word list. Contrary to prediction,
performance on the 50-word list was higher than on the 70-word list, i.e.,
the two dashed curves, combined, in the right hand plot of Fig. 16 are
higher than the two dashed curves, combined, in the left hand plot. If
both categorization effect and length of list effect were operating,
apparently the length of list effect was more potent.

The solid lines of Fig. 16, which are repeated in both left and right
hand plots, depict the performance of the 60-word group. The percent
correct for the complete categories (x's with solid lines) is significantly
greater (p < 0.01) than for the partial categories (circles with solid lines).
Thus the within-Ss comparison does reveal a significant categorization
effect, i.e., complete versus partial comparison. When these subsets of
categories are compared for the 70-word and the 50-word Ss there is no
significant difference (p > 0.05) between them. Thus the difference found
for the 60-word group cannot be ascribed to the particular subsets of
categories that were used as partial and complete. The interaction between
trials and the difference between partial and complete is also significant
(p < 0.01) for the 60-word group. An examination of the means, however,
indicates no orderly progression for the differences between complete
and partial across trials. The equivalent interactions for the 50-word
and 70-word groups were not significant (p > 0.05).

The overall percent correct for the 50-word list (all categories
incomplete) was slightly higher than that for the 60-word (mixed) list.
However, the comparison of overall percent correct for the 60-word list with
the 70-word list (all categories complete) indicates essentially identical
performance for the two lists (groups). Comparisons between a mixed-list
condition (one half of the categories complete and one half incomplete)
and the pure-list conditions (all categories complete or all categories
incomplete) confound the mixed-list versus pure-list effect with the
complete versus partial category effect, and both of these are confounded
with number of words in the list. The interactions amongst these effects
may be quite complex. However, the comparison of the 60-word and 70-word
lists (left plot in Fig. 16) provides some interesting suggestions.

Overall performance in terms of percent correct was essentially equivalent
for the two lists. Yet an examination of the complete versus partial
category subgroups of the list shows: first, that there is no difference
between these subgroups for the 70-word list (all categories complete);
second, that the identical categories (those which were complete) in the
60-word (mixed) list were easier to learn in the mixed list than in the
70-word list; and third, that the incomplete categories of the 60-word
(mixed) list were more difficult to learn than their complete versions
appearing in the 70-word list. It is as if the Ss of the 60-word (mixed) list
condition concentrated first on the complete categories and only after
obtaining some mastery of these did they switch their attention to the
incomplete categories of the list. In the 70-word list there was no
such differentiation amongst categories and both subsets of categories
were learned with equal speed. The effects of mixed versus pure lists
cannot be ignored in the evaluation of the basic hypothesis of this
experiment.



Discussion for Experiment Three 

The complete versus partial category effect has been shown for a
mixed list, but the magnitude of the length-of-list effect (70 versus
50 words) has apparently swamped whatever category effect might have
been operating in the between-Ss comparison. A test of the major
hypothesis with a between-Ss comparison calls for lists which differ much
less in total length. Going from a list of 70 words to one of 50 words
reduces the list length by 29 percent. Perhaps a reduction of only
five to 10 percent in list length would enable the category effect to be
more potent and lead to positive results. In support of this (if one is
willing to ignore the possible confounding effects of mixed versus pure
lists) it may be noted that in going from 70 to 60 words, list length
was reduced 14 percent, and there was no difference in overall percent
correct for the two conditions. Several variations on the present
experimental design may be suggested. One variation simply involves
utilizing exhaustive categories with larger numbers of items, e.g., days
of the week, months of the year, major cities in the state, etc. Removing
just one member from each of these categories would reduce list length by
a relatively small percentage. Another approach might involve an
incomplete category list with one item per category deleted, thus deleting,
e.g., ten items from the list. The complete category list for comparison
would have all but one item eliminated from enough of the categories so
that the same total number of items, e.g., ten, were deleted. Thus, the
comparison list would be of equal length in terms of total number of items
and would have the same number of categories represented. The difference
would be that the first list would have, e.g., ten incomplete categories
in it while the comparison list might have only one or two incomplete
categories in it (these categories containing only one item each).



Conclusions and Implications 



An experimental paradigm has been developed which enables one to
study the subjective organizations which Ss impose on material that they
are assigned to learn. Measures have been developed, based on the data
provided by this paradigm, which permit evaluating the degree to which Ss
utilize their own subjective organization during learning, the degree to
which subjective organization is consistent from one trial to the next,
and the degree to which subjective organizations are stereotyped across
Ss. Three experiments, exploring various aspects of the organization of
material during learning, permit the following general conclusions.

Giving Ss the opportunity to overtly organize the material to be learned
facilitates learning. Given the opportunity, those Ss who utilize it
learn more than those who do not. The performance of those Ss who are
given the opportunity to overtly organize, but who do not utilize that
opportunity, is essentially indistinguishable from that of Ss who were
not given the opportunity to overtly organize the material.

Predefined categories of words, based on free-association normative
data, vary considerably in the degree to which Ss perceive and utilize
them in their subjective organization of the material. The concept
dominance variable (Underwood and Richardson, 1956) is very weak in terms
of persuading Ss to utilize the categories so defined in their subjective
organization. The mutual relatedness variable (Cofer, 1965, and Marshall,
1967) is much more salient and persuasive. It covers the range from
highly persuasive to not persuasive at all, but even at the highly
persuasive end it is clear that these E-defined categories still leave
large portions of variance unaccounted for in the subjective organization
behavior of Ss. The context of the total list of items to be learned must
be taken into account before portions of this unaccounted-for variance
will be understood.

When Ss subjectively organize material there appears to be no
optimum size of subjective cluster for minimizing error. It is proposed
that when Ss develop unrestricted subjective categories they are in
essence manipulating the elements of the list until they achieve
subjective chunks which are, for them, equally "good." "Good" in this
context means easily learned. It is proposed that "good" may also be
interpreted to mean "remembered," but retention data are not yet available.

Though Ss may strive to reduce all subjective categories to equal
"goodness," there are characteristics of stimulus materials which make
this relatively impossible. This difference in materials is ascribed
to the very lengthy and extensive pre-experimental history which the S
brings with him to the experiment. The exhaustive categories of
Experiment Three are highly familiar to almost all Ss and almost all
utilize them fully in their subjective organization of the material to be
learned. However, an exhaustive category with one item missing is not as
"good" as the corresponding complete exhaustive category. For the complete
category S may utilize, without modification, his pre-experimental
history. For the incomplete categories, however, S must add something
to his pre-experimental learning in order to utilize it in the
experiment.

Alexander and Huggins (1964) report on the use of an approach 
similar to the study-sheet paradigm for a perception experiment. Their 
results are also quite encouraging. 
Summary

When presented with the task of learning meaningful verbal
material (and other forms of material, as well) most Ss cluster or
organize the items to be learned into subjectively meaningful groupings.
An experimental paradigm (the "study-sheet paradigm") has been developed
which permits the objective measurement of this subjective clustering
during learning. The unique subjective groupings for each individual
S may be identified on each learning trial. The paradigm incorporates
the use of specially prepared study sheets which each S prepares for
himself on each learning trial. The information on these study sheets
provides for the determination of the subjective clusters. Test
performance may then be examined with respect to these subjectively
defined clusters.

In addition to describing the study-sheet paradigm the present
report describes two experiments employing that paradigm, several newly
developed measures for previously unmeasured aspects of the subjective
organization of material, and a third experiment concerned with a particular
implication of subjective organization behavior.

The material to be learned in the first experiment had groupings
of words "built into it," with most of the groupings defined via the
concept dominance data provided by Underwood and Richardson (1956). An
experimental group of Ss learned the material via the study-sheet paradigm.
One control group learned under identical conditions except that they did
not have the opportunity to overtly organize the material on their study
sheets. A second control group learned under "standard" free-recall
conditions. The opportunity to overtly organize the material to be
learned facilitated learning, as the experimental group achieved more
words correct than either of the two control groups. Numbers of words
correct for the two control groups were almost identical. Further, only
those Ss of the experimental group who took advantage of the opportunity
to overtly organize the material showed better performance.
Performance for the Ss of the experimental group who showed no organization
behavior on their study sheets was indistinguishable from the performance
of the Ss in the two control groups. The category definitions based on
concept dominance are not salient and/or persuasive. Ss tend to ignore
them, and appear to find other criteria for categorizing or organizing
the words. The concept dominance categories are reflected in test
performance to a lesser and lesser degree as learning progresses. In
contrast, the subjectively defined categories are reflected in test
performance to a greater degree as learning progresses.

The material to be learned in the second experiment had pairs of
words "built into it," with the pairs defined via the mutual relatedness
data reported by Cofer (1965) and Marshall (1967). All Ss learned via
the study-sheet paradigm. A new measure (CON) was developed for the
consistency of organization from one trial to the next, independent of
the order in which the words were written. A second new measure developed
(STR) gave an index of the degree of stereotypy of organization for each
word and/or the set of words, i.e., the degree to which the different Ss
organized the material in the same way. A third new measure (CCP)
provides an index of the degree to which selected pairs of words are
categorized into the same subjective categories, i.e., a measure of the
common categorizing of the pairs. The mutual relatedness variable was
considerably more potent than the concept dominance variable of
Experiment One. Word pairs with high mutual relatedness strength are
salient and persuasive. Most Ss utilize these pairs in their subjective
organizations. Overall, however, the "built-in" pairs show a relative
decrease in frequency of usage as learning progresses when compared
with the frequency of usage of the subjectively defined categories. Even
for the highest mutual relatedness levels clustering never comes close to
complete or perfect. In terms of the mutual relatedness variable there
is strong and striking agreement between the results of the present study
and those of Marshall (1967), despite wide differences in procedure,
measurement, experimental design, and S population. There is no optimum
size of a subjective cluster, and Ss use an average of approximately 15
clusters for these 72 words. Ss whose tests reflect their study-sheet
organization are the ones who learn fastest. The degree of correspondence
between study and test organization increases with trials for almost
all Ss. Consistency of word order from one test to the next shows no
consistent growth over trials, but consistency of organization (as
measured by CON) from one study sheet to the next does show growth
over trials. However, CON is related (across Ss) to number correct
only very early in learning. The STR and CCP measures very nicely
reflect the variables "built into" the stimuli, and thus exhibit a kind
of concurrent validity. The CCP measure clearly points out the large
source of variance due to total list context effects, and the need to
understand these effects in order to account for subjective organization
behavior and, by inference, learning.

The third experiment utilized lists of exhaustive categories
(Cohen, 1963). For one group of Ss all categories were complete, for
a second group one word was missing from each category, and for a third
group one word was missing from one half of the categories. Exhaustive
categories are highly salient and persuasive. The pre-experimental
history of the Ss permits them to regenerate most of the words of the
categories given only that they remember one of the words (or the category
name). However, if one randomly selected word is omitted from a category,
then S must also remember which word to leave out. It was hypothesized
that the incomplete categories would be more difficult to learn than the
complete ones. The between-Ss comparison (groups one and two) was
confounded with length of list, and apparently list length was the more
potent variable. The results were counter to the hypothesis. However,
the within-Ss comparison (group three) clearly supported the hypothesis,
and the data suggested that it should be possible to design a between-Ss
experiment which would also support the hypothesis.
Appendix A



I. Index of stability of organization for word k from trial t-1 to trial t,
$R_{k(t-1)(t)}$, for any given subject, i. With KK words in the list which is
to be subjectively organized (words A, B, C, ..., k, ..., KK) build a one
by KK matrix for each word, k, for trial t-1, and another for trial t.
Considering all KK words this means, in essence, a square matrix KK by KK
big for each trial. For each cell entry for word A (row A), for example, let
the entry $c_{Ak}$ equal one if word A was in the same subjective category as
word k, let $c_{Ak}$ equal zero if word A was not in the same subjective
category, and let $c_{AA}$ equal one. This is done separately for each trial
for a given subject. Taking word A as the example, define $R_{A(t-1)(t)}$ as:

\[ R_{A(t-1)(t)} = \frac{n_{Ac}^{2}}{n_{A(t-1)} \cdot n_{A(t)}} \]

where

$n_{Ac} = \sum_{k=A}^{KK} (c_{Ak(t-1)}) \cdot (c_{Ak(t)})$, which is a count of the
number of words categorized with word A on trial t-1 that were also
categorized with word A on trial t (the minimum value is 1,
for word A alone). This count is squared for the numerator.

$n_{A(t-1)} = \sum_{k=A}^{KK} c_{Ak(t-1)}$, which is the number of words categorized
with word A on trial t-1.

$n_{A(t)} = \sum_{k=A}^{KK} c_{Ak(t)}$, which is the number of words categorized with
word A on trial t.

$R_{A(t-1)(t)}$ is interpretable as the square of a correlation coefficient
defined in terms of "common elements." In other words, it is the product of
two proportions, $n_{Ac}/n_{A(t-1)}$ times $n_{Ac}/n_{A(t)}$. If the number of
common elements equals the total number of elements for either trial t-1 or
trial t, then the index of stability is simply equal to the other proportion.



Thus, so far, we have two matrices of the following form for
each subject, i:

Trial t-1

                        Word
Word      A    B    C    D   ...   k   ...   KK
  A       1    1    0    1   ...
  B       1    1    0    1   ...
  C       0    0    1    0   ...
  D       1    1    0    1   ...
  .
  k
  .
  KK

  Sum     3    3    1    3

Trial t

                        Word
Word      A    B    C    D   ...   k   ...   KK
  A       1    1    0    0   ...
  B       1    1    0    0   ...
  C       0    0    1    0   ...
  D       0    0    0    1   ...
  .
  k
  .
  KK

  Sum     2    2    1    1

In the particular example depicted, subject i formed a cluster of words
A, B, and D on trial t-1, and for the portion of the table shown word C
was not clustered with any other words, i.e., it was an isolate. In trial
t the subject formed a cluster of words A and B, dropping word D from the
cluster and making it an isolate. Word C remained an isolate. The index
$R_{A(t-1)(t)}$ would equal the square of (1·1 + 1·1 + 0·0 + 1·0 + ...)
in the numerator, and the product of 3 times 2 in the denominator. The
index of stability for word A in the sample would then be 4/(3·2) or 2/3.
The index is the same for word B, equal to 1 for word C, and equal to 1/3
for word D.
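The per-word index of Section I can be checked mechanically. The following is only a minimal sketch, assuming each trial's study sheet has already been reduced to a list of word clusters; the function names and the data layout are illustrative and are not taken from the report.

```python
from fractions import Fraction


def cluster_of(word, partition):
    """Return the cluster (as a set) that contains `word` on one trial."""
    for cluster in partition:
        if word in cluster:
            return set(cluster)
    raise ValueError(f"{word!r} does not appear in the partition")


def word_stability(word, prev_partition, curr_partition):
    """Index I for one word: (common count)^2 / (size on t-1 * size on t)."""
    prev = cluster_of(word, prev_partition)
    curr = cluster_of(word, curr_partition)
    common = len(prev & curr)  # counts the word itself, so it is at least 1
    return Fraction(common ** 2, len(prev) * len(curr))


# Worked example from the text: (A B D) (C) on trial t-1, (A B) (C) (D) on trial t.
trial_prev = [{"A", "B", "D"}, {"C"}]
trial_curr = [{"A", "B"}, {"C"}, {"D"}]

stability = {w: word_stability(w, trial_prev, trial_curr) for w in "ABCD"}
for w, value in stability.items():
    print(w, value)  # A 2/3, B 2/3, C 1, D 1/3
```

Exact fractions are used so that the results can be compared directly with the values 2/3, 1, and 1/3 quoted in the example.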






II. Index of stability of organization for a cluster of words (cluster defined
on trial t) from trial t-1 to trial t, for any given subject, i. The
clusters on trial t are 1, 2, ..., m, ..., M. The index, $R_{m(t-1)(t)}$, is
defined as:

\[ R_{m(t-1)(t)} = \frac{\sum R_{k(t-1)(t)}}{n_m} \]

where the summation is over all the words in the m-th cluster (defined on
trial t), and $n_m$ is simply the number of words in that cluster. Thus,
$R_{m(t-1)(t)}$ is simply the arithmetic average of the $R_{k(t-1)(t)}$ for
the words of the cluster.



III. Index of stability of organization from trial t-1 to trial t for any given
subject, i. This is simply the arithmetic average of all KK of the
$R_{k(t-1)(t)}$:

\[ R_{(t-1)(t)} = \frac{1}{KK} \sum_{k=A}^{KK} R_{k(t-1)(t)} \]
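Indexes II and III are plain averages of the per-word values. As a hedged illustration, the sketch below reuses the four per-word values from the worked example of Section I (2/3, 2/3, 1, 1/3) and treats the four words shown as the whole list; the function names are hypothetical.

```python
from fractions import Fraction


def cluster_stability(cluster, word_index):
    """Index II: mean per-word index over the words of one trial-t cluster."""
    return sum(word_index[w] for w in cluster) / len(cluster)


def subject_stability(word_index):
    """Index III: mean per-word index over every word in the list."""
    return sum(word_index.values()) / len(word_index)


# Per-word values from the worked example (assumed here to be the full list).
word_index = {
    "A": Fraction(2, 3),
    "B": Fraction(2, 3),
    "C": Fraction(1),
    "D": Fraction(1, 3),
}
clusters_t = [{"A", "B"}, {"C"}, {"D"}]  # clusters as defined on trial t

per_cluster = [cluster_stability(c, word_index) for c in clusters_t]
overall = subject_stability(word_index)
print(per_cluster, overall)
```

Averaging the subject-level values over a set of subjects (Index IV) would follow the same pattern.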



IV. Index of stability of organization from trial t-1 to trial t for any given
set of subjects. This is simply the arithmetic average of the Index III values

for all of the subjects in the set, and will be labelled 

V. Index of stereotypy of clustering. This index reflects the degree to which
a given word (e.g., word A) is clustered with the same words, across a set
of Ss. Indexes I through IV are concerned with comparisons between trials
t-1 and t, where the basic comparison is within a single S for a single
word, and averages are then found across words, and then across Ss. The
basic comparison for the index of stereotypy (Index V) is also for a single
word, but it is not within a single S, and it involves only a single trial.

Thus, for a word (e.g., A) on a particular trial the index, $r_A$, is
defined as follows:

Given the set of N Ss (1, 2, 3, ..., i, ..., N), each S puts the word
A into a cluster with from one (A is an isolated word) to n words
(i.e., n - 1 other words). All possible pairs of Ss (i and j) are considered,
one pair at a time. For each pair of Ss the index $r_{Aij}$ is determined, where
the index reflects the similarity or overlap of the words which each S
associated with word A, i.e., put into a cluster with word A. The basic
index is similar in form to the previous indexes in that the square of the
count of the number of words both of a pair of Ss clustered with word A
goes in the numerator; and the denominator is simply the number of words
clustered with word A for one of the Ss (i), times the corresponding number
for the other S (j) of the pair. The basic data may be entered into a matrix
of the following form:



For Word A

                          Word
Subject    A    B    C    D   ...   k   ...   KK     Row sum
   1       1    1    0    1   ...                    3 = n_A1
   2       1    1    1    1   ...                    4 = n_A2
   3       1    0    1    0   ...                    2 = n_A3
   4       1    0    0    0   ...                    1 = n_A4
   .
   i                                                     n_Ai
   .
   j                                                     n_Aj
   .
   N                                                     n_AN

Column    n_AA n_AB n_AC n_AD ...  n_Ak ...          n_A.
sum        4    2    2    2                           10



There would then be a similar matrix for each of the KK words. 

For the Word A matrix: for each row (subject) there is a "1" in each
column corresponding to words which that subject included in the same
subjective category as word A, and a zero in the remaining cells of the row.
In other words, let each cell entry, $c_{Aik}$, equal "1" if for subject i the
kth word was included in the same subjective category with the word A.
All other c's equal zero. The $n_{Ai}$ are simply the sums of the entries in
row i, and, for subject i, they are simply the number of words in the
subjective category within which word A occurred. The $n_{Ak}$ are the sums of
the columns, and they are simply the number of subjects who included word k
in the same subjective category as word A. (Note: for a complete table for
Word A, $n_{AA} = N$.) The $n_{A.}$ is simply the total of the ones
in the entire table for word A, and it is the sum total of the number of
words categorized with word A across the set of N subjects.

All possible pairs of subjects (rows) in the above Word A matrix are
to be considered. There are, of course, (N/2)(N-1) different pairs. For
each pair of subjects, i and j, define an index of common categorization as:

\[ r_{Aij} = \frac{n_{Aij}^{2}}{(n_{Ai}) \cdot (n_{Aj})} \]

where

\[ n_{Aij} = \sum_{k=A}^{KK} (c_{Aik}) \cdot (c_{Ajk}), \]

$n_{Ai}$ = sum of row i, as defined in the Word A matrix above,
$n_{Aj}$ = sum of row j, as defined in the Word A matrix above.

The $n_{Aij}$ are the counts of the words which both subjects i and j
categorized with word A. Thus, the numerator for $r_{A12}$ would be the
square of the quantity (1·1 + 1·1 + 0·1 + 1·1), which equals $3^2$. The
denominator equals 3 times 4. And, $r_{A12} = 3^2/(3 \cdot 4) = 3/4$. The
$r_{Aij}$ for the example depicted in the Table for Word A are:

\[
\begin{aligned}
r_{A12} &= 3^2 / (3 \cdot 4) = 3/4 \\
r_{A13} &= 1^2 / (3 \cdot 2) = 1/6 \\
r_{A14} &= 1^2 / (3 \cdot 1) = 1/3 \\
r_{A23} &= 2^2 / (4 \cdot 2) = 1/2 \\
r_{A24} &= 1^2 / (4 \cdot 1) = 1/4 \\
r_{A34} &= 1^2 / (2 \cdot 1) = 1/2
\end{aligned}
\]

The index of stereotypy for Word A (for the set of subjects, N) is then
defined as the arithmetic average of all of the $r_{Aij}$. With N subjects
there are (N)(N-1)/2 different pairs of subjects, hence that many
different $r_{Aij}$. Thus,

\[ (5) \qquad r_A = \frac{2}{(N)(N-1)} \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} r_{Aij} \]

For the above example

\[ r_A = \frac{2}{4 \cdot 3} \left[ 3/4 + 1/6 + 1/3 + 1/2 + 1/4 + 1/2 \right] = (1/6) \cdot (30/12) = 0.4167. \]

If only the first three of the subjects are considered the example
would be:

\[
\begin{aligned}
r_{A12} &= 3/4 \\
r_{A13} &= 1/6 \\
r_{A23} &= 1/2 \\
r_A &= (1/3) \left[ (9 + 2 + 6)/12 \right] = 0.472.
\end{aligned}
\]



A computer program will be written to work with sets of 20 (or fewer)
subjects. Two, or more, sets of subjects will provide reliability
estimates for the index. The same thing that was done for Word A above
is, of course, done for each of the KK words, thus yielding an $r_k$ for
each of the KK words.
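The stereotypy computation for one word can be sketched in a few lines. This is an illustration under stated assumptions, not the program mentioned above: the input is taken to be, for each subject, the set of words in the cluster that contains the target word, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def pair_index(cluster_i, cluster_j):
    """Overlap index for one pair of subjects: common^2 / (size_i * size_j)."""
    common = len(cluster_i & cluster_j)
    return Fraction(common ** 2, len(cluster_i) * len(cluster_j))


def stereotypy(clusters):
    """Index V: mean of pair_index over all N(N-1)/2 pairs of subjects."""
    pairs = list(combinations(clusters, 2))
    if not pairs:
        raise ValueError("at least two subjects are required")
    return sum(pair_index(a, b) for a, b in pairs) / len(pairs)


# Rows of the Word A table: each subject's cluster containing word A.
word_a_clusters = [
    {"A", "B", "D"},       # subject 1
    {"A", "B", "C", "D"},  # subject 2
    {"A", "C"},            # subject 3
    {"A"},                 # subject 4
]

r_a_four = stereotypy(word_a_clusters)       # 5/12, i.e. 0.4167
r_a_three = stereotypy(word_a_clusters[:3])  # 17/36, i.e. 0.472
print(float(r_a_four), float(r_a_three))
```

Running the same function on each word's clusters in turn would give one value per word, as the text describes.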

VI. Index of stereotypy of clustering for sets of words. This is simply the
arithmetic average of the $r_k$ for the words in the set. Sets may be
defined by the experimenter (on a priori grounds, or on the basis of prior
experimental evidence) to include any number of words from 2 through KK.
The average across all KK words is, of course, the average stereotypy for
the total set of stimulus words used.




VII. Index of common clustering for any specified pair of words. The intent
is not to try to look at all possible pairs of words, but to use this
index to examine only those particular pairs which are of special interest
on a priori grounds or on the basis of prior experimental evidence. One
obvious index which might be considered is simply the proportion of
subjects, out of the set of N subjects, who include the particular pair of
words in the same subjective category. For example, consider the pair of
words A and B. Utilizing the example depicted in the Table for Word A
(the Table for Word B, depicted below, could also be used) this index
would simply be the ratio of column sums $n_{AB}/n_{AA}$ (which equals
$n_{BA}/n_{BB}$ from the Word B Table, below), or 2/4 = 1/2. Thus, one half
of the four subjects depicted in the example formed subjective categories
such that words A and B were included within a single category. This simple
index fails to take into account the sizes of the subjective categories in
which words A and B are imbedded.

If most subjects include these words in categories with many other words,
i.e., large categories, then these two words would fall within the same
category by chance more frequently than they would under the conditions
in which the two words are usually imbedded in small categories. For this
reason a somewhat more complex index will be used. It is similar in form
to the previous indexes. It is defined as:

\[ (6) \qquad r_{AB} = \frac{n_{AB}^{2}}{(n_{A.}) \cdot (n_{B.})} \]

where

$n_{AB}$ = the sum, across the set of N subjects, of the numbers of
words in the categories within which the pair of words A and
B may be found (a sum of category sizes, and so not the column sum
used in the simple index above),

$n_{A.}$ = the sum, across the set of N subjects, of the numbers of
words in the categories within which the word A may be found,

$n_{B.}$ = the sum, across the set of N subjects, of the numbers of
words in the categories within which the word B may be found.

In order to illustrate the computation of $r_{AB}$ an illustrative table for
Word B is also needed.



For Word B

                          Word
Subject    A    B    C    D   ...   k   ...   KK     Row sum
   1       1    1    0    1   ...                    3 = n_B1
   2       1    1    1    1   ...                    4 = n_B2
   3       0    1    0    0   ...   1   ...   1      3 = n_B3
   4       0    1    0    0   ...                    1 = n_B4
   .
   i                                                     n_Bi
   .
   j                                                     n_Bj
   .
   N                                                     n_BN

Column    n_BA n_BB n_BC n_BD ...  n_Bk ...  n_BKK   n_B.
sum        2    4    1    2         1         1       11



The quantity $n_{AB}$ may now be obtained in any one of three ways. The first
is from the Word A Table and is:

\[ n_{AB} = \sum_{i=1}^{N} (c_{AiB}) \cdot (n_{Ai}). \]

The second is from the Word B Table and is:

\[ n_{AB} = \sum_{i=1}^{N} (c_{BiA}) \cdot (n_{Bi}). \]

The third involves the Tables for both Word A and Word B, and is:

\[ n_{AB} = \sum_{i=1}^{N} \sum_{k=A}^{KK} (c_{Aik}) \cdot (c_{Bik}). \]

All three yield identical answers. The quantities $n_{A.}$ and $n_{B.}$ are
simply the sums of the ones in the Word A and Word B Tables, respectively,
and are shown in the lower right hand corners of the Tables.

Using all four of the subjects depicted in the Tables:

\[ r_{AB} = 7^2 / (10) \cdot (11) = 0.445. \]

Using only the first three subjects:

\[ r_{AB} = 7^2 / (9) \cdot (10) = 0.544. \]

And using only the first two subjects:

\[ r_{AB} = 7^2 / (7) \cdot (7) = 1.0. \]
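The pair index of Section VII can be computed directly from each subject's clustering. The sketch below is illustrative only: the partitions are those implied by the Word A and Word B tables, the placement of word D for subjects 3 and 4 is an assumption (it does not affect the A-B result), and the function names are hypothetical.

```python
from fractions import Fraction


def cluster_of(word, partition):
    """Return the cluster (as a set) that contains `word` for one subject."""
    for cluster in partition:
        if word in cluster:
            return set(cluster)
    raise ValueError(f"{word!r} does not appear in the partition")


def common_categorization(word_a, word_b, partitions):
    """Pair index: n_AB^2 / (n_A. * n_B.), each n being a sum of cluster sizes."""
    n_ab = n_a = n_b = 0
    for partition in partitions:
        cluster_a = cluster_of(word_a, partition)
        cluster_b = cluster_of(word_b, partition)
        n_a += len(cluster_a)
        n_b += len(cluster_b)
        if word_b in cluster_a:  # the pair shares a cluster for this subject
            n_ab += len(cluster_a)
    return Fraction(n_ab ** 2, n_a * n_b)


partitions = [
    [{"A", "B", "D"}, {"C"}],               # subject 1
    [{"A", "B", "C", "D"}],                 # subject 2
    [{"A", "C"}, {"B", "k", "KK"}, {"D"}],  # subject 3 (D assumed isolated)
    [{"A"}, {"B"}, {"C", "k"}, {"D"}],      # subject 4 (D assumed isolated)
]

r_ab_four = common_categorization("A", "B", partitions)       # 49/110
r_ab_three = common_categorization("A", "B", partitions[:3])  # 49/90
r_ab_two = common_categorization("A", "B", partitions[:2])    # 1
print(float(r_ab_four), float(r_ab_three), float(r_ab_two))
```

Because large clusters inflate both the numerator and the denominator, the index stays at 1 only when every subject puts the two words in the same cluster.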

VIII. Further computational illustrations. The computation of $r_B$, the index
of stereotypy of clustering for Word B (Section V), is done from the
example depicted in the Table for Word B as follows:

\[
\begin{aligned}
r_{B12} &= 3^2 / (3 \cdot 4) = 3/4 \\
r_{B13} &= 1^2 / (3 \cdot 3) = 1/9 \\
r_{B14} &= 1^2 / (3 \cdot 1) = 1/3 \\
r_{B23} &= 1^2 / (4 \cdot 3) = 1/12 \\
r_{B24} &= 1^2 / (4 \cdot 1) = 1/4 \\
r_{B34} &= 1^2 / (3 \cdot 1) = 1/3
\end{aligned}
\]

And $r_B = (2/12) [(27 + 4 + 12 + 3 + 9 + 12) / 36] = 67/216 = 0.310$.

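The computation of $r_C$, $r_{AC}$, and $r_{BC}$ requires the use of the Table
for Word C. An example Table is as follows:

For Word C

                          Word
Subject    A    B    C    D   ...   k   ...   KK     Row sum
   1       0    0    1    0   ...                    1 = n_C1
   2       1    1    1    1   ...                    4 = n_C2
   3       1    0    1    0   ...                    2 = n_C3
   4       0    0    1    0   ...   1                2 = n_C4
   .
   N

Column    n_CA n_CB n_CC n_CD                        n_C.
sum        2    1    4    1                           9

\[
\begin{aligned}
r_{C12} &= 1^2 / (1 \cdot 4) = 1/4 \\
r_{C13} &= 1^2 / (1 \cdot 2) = 1/2 \\
r_{C14} &= 1^2 / (1 \cdot 2) = 1/2 \\
r_{C23} &= 2^2 / (4 \cdot 2) = 4/8 = 1/2 \\
r_{C24} &= 1^2 / (4 \cdot 2) = 1/8 \\
r_{C34} &= 1^2 / (2 \cdot 2) = 1/4
\end{aligned}
\]

And $r_C = (2/12) [(2 + 4 + 4 + 4 + 1 + 2) / 8] = 0.354$.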



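Using all four of the subjects depicted in the Tables:

\[ r_{AC} = 6^2 / (9) \cdot (10) = 0.400. \]

\[ r_{BC} = 4^2 / (9) \cdot (11) = 0.162. \]

Using just the first three subjects depicted:

\[ r_{AC} = 6^2 / (7) \cdot (9) = 0.571. \]

\[ r_{BC} = 4^2 / (7) \cdot (10) = 0.229. \]

Using just the first two subjects depicted:

\[ r_{AC} = 4^2 / (5) \cdot (7) = 0.457. \]

\[ r_{BC} = 4^2 / (5) \cdot (7) = 0.457. \]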

In summary of the samples depicted:

Subject 1 clustered words (ABD) (C)
Subject 2 clustered words (ABCD)
Subject 3 clustered words (AC) (B k KK) (D?)
Subject 4 clustered words (A) (B) (C k) (D?)

$r_A$ = 0.417 (0.472 for the first three subjects only).
$r_B$ = 0.310.
$r_C$ = 0.354.

And for all four subjects:

$r_{AB}$ = 0.445.
$r_{AC}$ = 0.400.
$r_{BC}$ = 0.162.

For just the first three subjects:

$r_{AB}$ = 0.544.
$r_{AC}$ = 0.571.
$r_{BC}$ = 0.229.

For just the first two subjects:

$r_{AB}$ = 1.00.
$r_{AC}$ = 0.457.
$r_{BC}$ = 0.457.

Appendix B

Derivation of the Bousfield formula as reported by Dallett (1964).

Repetitions expected by chance (RC) equals

\[ RC = \frac{\sum_{i=1}^{k} n_i^{2}}{\sum_{i=1}^{k} n_i} - 1
      = \frac{\sum_{i=1}^{k} n_i^{2} - \sum_{i=1}^{k} n_i}{\sum_{i=1}^{k} n_i}
      = \frac{\sum_{i=1}^{k} (n_i)(n_i - 1)}{\sum_{i=1}^{k} n_i} \]

where $n_i$ is the number of words of the i-th category on a test, and
k is the number of different categories on the test represented by
at least one word.



Derivation:

1) For any $n_i$ words the number of possible pairwise combinations
(i.e., repetitions, or adjacencies) equals $(1/2)(n_i)(n_i - 1)$. It
is simply the number of combinations of $n_i$ things taken two at a
time. For the i-th category it is the number of possible "successes"
(i.e., adjacencies).

2) For k different categories the total number of possible "successes"
is simply the sum of the number of successes for each of the
categories, or

\[ (1/2) \sum_{i=1}^{k} (n_i)(n_i - 1). \]

3) The total number of words on a test equals $\sum_{i=1}^{k} n_i = N$.

4) The total number of possible pairwise combinations of N words is
(1/2)(N)(N - 1).

5) Of the (1/2)(N)(N - 1) total possible pairs there are
$(1/2) \sum_{i=1}^{k} (n_i)(n_i - 1)$ possible "successes". Dividing the number
of possible successes by the total number of possible pairs yields
the probability of any given pair, drawn at random, being a success.
Or,

\[ \text{Prob. of any random pair being a "success"} = \frac{\sum_{i=1}^{k} (n_i)(n_i - 1)}{N (N - 1)}. \]

6) In any ordered list of N words there are, in fact, N - 1 pairs
(adjacencies).

7) Since any list of N words yields N - 1 pairs, and each pair has
the same probability of being a "success," then the expected number
of "successes" (i.e., chance adjacencies, or RC) equals the number
of pairs times the probability that a pair will be a "success".
Thus,

\[ RC = \frac{\sum_{i=1}^{k} (n_i)(n_i - 1)}{N (N - 1)} \times (N - 1) = \frac{\sum_{i=1}^{k} (n_i)(n_i - 1)}{N}. \]

Since $N = \sum_{i=1}^{k} n_i$ (paragraph 3),

\[ RC = \frac{\sum_{i=1}^{k} (n_i)(n_i - 1)}{\sum_{i=1}^{k} n_i}. \]
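The end result of the derivation is easy to verify numerically. The following sketch is only an illustration with hypothetical function names; it assumes the input is the list of per-category word counts on one test and checks that the final form agrees with the first form of the formula given at the start of this appendix.

```python
from fractions import Fraction


def rc_final_form(category_counts):
    """RC = sum(n_i * (n_i - 1)) / sum(n_i), over categories with n_i >= 1."""
    counts = [n for n in category_counts if n > 0]
    total = sum(counts)
    if total == 0:
        raise ValueError("the test must contain at least one word")
    return Fraction(sum(n * (n - 1) for n in counts), total)


def rc_first_form(category_counts):
    """RC = sum(n_i^2) / sum(n_i) - 1, the form stated before the derivation."""
    counts = [n for n in category_counts if n > 0]
    total = sum(counts)
    if total == 0:
        raise ValueError("the test must contain at least one word")
    return Fraction(sum(n * n for n in counts), total) - 1


# Illustrative test: three categories contributing 3, 2, and 1 words.
example_counts = [3, 2, 1]
rc_example = rc_final_form(example_counts)
print(rc_example, rc_first_form(example_counts))  # 4/3 4/3
```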






Special case where all categories are, at most, two words big:

All tests have been scored for repetitions possible (RP),
and for number correct (C). If all categories being examined for
on the tests are, at most, two words big, then the RP and C
measures for a given test may be used to calculate the RC measure for
that test. Such is the case for the E-defined categories of
Experiment Two. For every "repetition possible" on the test $n_i = 2$, or,
$n_i = 2$ RP times. For every occurrence on the test of a single word
of a pair, $n_i = 1$. For the present situation $n_i$ can take only the
values of one or two. When $n_i = 1$ it has no effect on the numerator
of the formula for RC, which is $\sum (n_i)(n_i - 1)$, since $(n_i - 1) = 0$.
Thus, the numerator is simply equal to (2)(2 - 1)(RP). The
denominator, $\sum n_i$, is simply equal to the number of words correct,
C. Hence,

\[ RC = \frac{(2)(2 - 1)(RP)}{C} = \frac{2RP}{C}. \]

This formula has some interesting, counter-intuitive properties.
First, RC ≤ 1.0 regardless of the value of C, since 2RP ≤ C. In
other words, no matter how many correct items there are on a test,
the expected number of chance repetitions will be equal to, or less
than, one. Second, if all words on the test are members of pairs
of words, i.e., if 2RP = C, then RC = 1 regardless of the number of
words on the test.

The quantity RO - RC for this special condition in which all
E-defined categories are pairs is equal to

\[ RO - RC = RO - 2(RP)/C. \]

For the plots in Fig. 7 there are three E-defined pairs for each
plotted point, hence RO for any one S can take the values 0, 1, 2,
or 3. RC can vary between 0 and 1. Hence the quantity RO - RC
must vary between 2 (for perfect clustering of all three pairs),
through 0 (for the case where RO = RC, i.e., when both measures
equal one or zero), to minus 1 (for the case where RO = 0 and all
six words are present, but scattered throughout the test list,
i.e., RP = 3, C = 6, and RC = 1).
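The special case can be checked against the general formula. The sketch below is a hedged illustration with hypothetical function names; it assumes every category on the test holds either one or two words, so that a test with RP complete pairs and C words correct has RP categories of size two and C - 2RP categories of size one.

```python
from fractions import Fraction


def rc_pairs(rp, c):
    """Special case: RC = 2*RP / C when no category exceeds two words."""
    if rp < 0 or c <= 0 or 2 * rp > c:
        raise ValueError("need RP >= 0, C > 0, and 2*RP <= C")
    return Fraction(2 * rp, c)


def rc_general(category_counts):
    """General form: sum(n_i * (n_i - 1)) / sum(n_i)."""
    return Fraction(sum(n * (n - 1) for n in category_counts), sum(category_counts))


def counts_for(rp, c):
    """Category sizes implied by RP complete pairs among C correct words."""
    return [2] * rp + [1] * (c - 2 * rp)


# All three pairs present (RP = 3, C = 6): RC = 1, so RO - RC runs from
# 3 - 1 = 2 (all pairs adjacent) down to 0 - 1 = -1 (no pair adjacent).
rc_all_pairs = rc_pairs(3, 6)
extremes = (3 - rc_all_pairs, 0 - rc_all_pairs)
print(rc_all_pairs, extremes)
```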
Abstract

Ss impose some organization on the material during the process of learning. An
experimental paradigm is described which permits the objective scoring of each S's
subjective organization of the material on each learning trial. The central feature
of the paradigm is the utilization of an essentially unstructured study sheet which
each S prepares anew, for his own use, during each learning trial. New indices are
developed for measuring: a) the consistency of organization (independent of
sequential order) from one trial to the next; b) the stereotypy of organization
(i.e., degree of "sameness") across Ss on any one trial; and c) the extent to which
specific pairs of words are to be found in the same subjective categories for
different Ss. Three experiments are described. The extent to which Ss utilize
E-defined organization is evaluated. "Concept Dominance" is not persuasive, but high
degrees of "Mutual Relatedness" are. Much of the subjective organization behavior
depends on the total set of items involved. Overt organizing, and the utilization
of it, facilitates learning. There is no optimum size of a subjective cluster.
Consistency of word order from one test to the next does not increase with learning,
but consistency of organization does. The other two new indices reflect the Mutual
Relatedness variable. The third experiment indicates that in a mixed list of items,
exhaustive categories are learned faster than are exhaustive categories with one
item missing.



